Search Results for author: Pang Wei Koh

Found 33 papers, 25 papers with code

Overparameterization hurts worst-group accuracy with spurious correlations

no code implementations • ICML 2020 • Shiori Sagawa, aditi raghunathan, Pang Wei Koh, Percy Liang

Increasing model capacity well beyond the point of zero training error has been observed to improve average test accuracy.

Paper
Add Code

Information-Theoretic Distillation for Reference-less Summarization

no code implementations • 20 Mar 2024 • JaeHun Jung, Ximing Lu, Liwei Jiang, Faeze Brahman, Peter West, Pang Wei Koh, Yejin Choi

The current winning recipe for automatic summarization is using proprietary large-scale language models (LLMs) such as ChatGPT as is, or imitation learning from them as teacher models.

Imitation Learning

Paper
Add Code

Reliable, Adaptable, and Attributable Language Models with Retrieval

no code implementations • 5 Mar 2024 • Akari Asai, Zexuan Zhong, Danqi Chen, Pang Wei Koh, Luke Zettlemoyer, Hannaneh Hajishirzi, Wen-tau Yih

Parametric language models (LMs), which are trained on vast amounts of web data, exhibit remarkable flexibility and capability.

Question Answering Retrieval

Paper
Add Code

Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models

1 code implementation • 5 Feb 2024 • Zhiyuan Hu, Chumin Liu, Xidong Feng, Yilun Zhao, See-Kiong Ng, Anh Tuan Luu, Junxian He, Pang Wei Koh, Bryan Hooi

In the face of uncertainty, the ability to seek information is of fundamental importance.

Medical Diagnosis

Paper
Code

Instructional Fingerprinting of Large Language Models

1 code implementation • 21 Jan 2024 • Jiashu Xu, Fei Wang, Mingyu Derek Ma, Pang Wei Koh, Chaowei Xiao, Muhao Chen

The exorbitant cost of training Large language models (LLMs) from scratch makes it essential to fingerprint the models to protect intellectual property via ownership authentication and to ensure downstream users and developers comply with their license terms (e. g. restricting commercial use).

Paper
Code

The Generative AI Paradox: "What It Can Create, It May Not Understand"

no code implementations • 31 Oct 2023 • Peter West, Ximing Lu, Nouha Dziri, Faeze Brahman, Linjie Li, Jena D. Hwang, Liwei Jiang, Jillian Fisher, Abhilasha Ravichander, Khyathi Chandu, Benjamin Newman, Pang Wei Koh, Allyson Ettinger, Yejin Choi

Specifically, we propose and test the Generative AI Paradox hypothesis: generative models, having been trained directly to reproduce expert-like outputs, acquire generative capabilities that are not contingent upon -- and can therefore exceed -- their ability to understand those same types of outputs.

Paper
Add Code

OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models

2 code implementations • 2 Aug 2023 • Anas Awadalla, Irena Gao, Josh Gardner, Jack Hessel, Yusuf Hanafy, Wanrong Zhu, Kalyani Marathe, Yonatan Bitton, Samir Gadre, Shiori Sagawa, Jenia Jitsev, Simon Kornblith, Pang Wei Koh, Gabriel Ilharco, Mitchell Wortsman, Ludwig Schmidt

We introduce OpenFlamingo, a family of autoregressive vision-language models ranging from 3B to 9B parameters.

Ranked #14 on Visual Question Answering (VQA) on InfiMM-Eval

Visual Question Answering

3,485

Paper
Code

Are aligned neural networks adversarially aligned?

no code implementations • NeurIPS 2023 • Nicholas Carlini, Milad Nasr, Christopher A. Choquette-Choo, Matthew Jagielski, Irena Gao, Anas Awadalla, Pang Wei Koh, Daphne Ippolito, Katherine Lee, Florian Tramer, Ludwig Schmidt

We show that existing NLP-based optimization attacks are insufficiently powerful to reliably attack aligned text models: even when current NLP-based attacks fail, we can find adversarial inputs with brute force.

Paper
Add Code

Proximity-Informed Calibration for Deep Neural Networks

1 code implementation • NeurIPS 2023 • Miao Xiong, Ailin Deng, Pang Wei Koh, Jiaying Wu, Shen Li, Jianqing Xu, Bryan Hooi

We examine the problem over 504 pretrained ImageNet models and observe that: 1) Proximity bias exists across a wide variety of model architectures and sizes; 2) Transformer-based models are relatively more susceptible to proximity bias than CNN-based models; 3) Proximity bias persists even after performing popular calibration algorithms like temperature scaling; 4) Models tend to overfit more heavily on low proximity samples than on high proximity samples.

Paper
Code

FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation

5 code implementations • 23 May 2023 • Sewon Min, Kalpesh Krishna, Xinxi Lyu, Mike Lewis, Wen-tau Yih, Pang Wei Koh, Mohit Iyyer, Luke Zettlemoyer, Hannaneh Hajishirzi

Evaluating the factuality of long-form text generated by large language models (LMs) is non-trivial because (1) generations often contain a mixture of supported and unsupported pieces of information, making binary judgments of quality inadequate, and (2) human evaluation is time-consuming and costly.

Language Modelling Retrieval +1

272

Paper
Code

On the Trade-off of Intra-/Inter-class Diversity for Supervised Pre-training

no code implementations • NeurIPS 2023 • Jieyu Zhang, Bohan Wang, Zhengyu Hu, Pang Wei Koh, Alexander Ratner

Pre-training datasets are critical for building state-of-the-art machine learning models, motivating rigorous study on their impact on downstream tasks.

Paper
Add Code

DataComp: In search of the next generation of multimodal datasets

1 code implementation • NeurIPS 2023 • Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, Eyal Orgad, Rahim Entezari, Giannis Daras, Sarah Pratt, Vivek Ramanujan, Yonatan Bitton, Kalyani Marathe, Stephen Mussmann, Richard Vencu, Mehdi Cherti, Ranjay Krishna, Pang Wei Koh, Olga Saukh, Alexander Ratner, Shuran Song, Hannaneh Hajishirzi, Ali Farhadi, Romain Beaumont, Sewoong Oh, Alex Dimakis, Jenia Jitsev, Yair Carmon, Vaishaal Shankar, Ludwig Schmidt

Multimodal datasets are a critical component in recent breakthroughs such as Stable Diffusion and GPT-4, yet their design does not receive the same research attention as model architectures or training algorithms.

Paper
Code

Out-of-Domain Robustness via Targeted Augmentations

1 code implementation • 23 Feb 2023 • Irena Gao, Shiori Sagawa, Pang Wei Koh, Tatsunori Hashimoto, Percy Liang

Models trained on one set of domains often suffer performance drops on unseen domains, e. g., when wildlife monitoring models are deployed in new camera locations.

Paper
Code

Improving Domain Generalization with Domain Relations

no code implementations • 6 Feb 2023 • Huaxiu Yao, Xinyu Yang, Xinyi Pan, Shengchao Liu, Pang Wei Koh, Chelsea Finn

Distribution shift presents a significant challenge in machine learning, where models often underperform during the test stage when faced with a different distribution than the one they were trained on.

Domain Generalization

Paper
Add Code

Impossibility Theorems for Feature Attribution

1 code implementation • 22 Dec 2022 • Blair Bilodeau, Natasha Jaques, Pang Wei Koh, Been Kim

Despite a sea of interpretability methods that can produce plausible explanations, the field has also empirically seen many failure cases of such methods.

Paper
Code

Wild-Time: A Benchmark of in-the-Wild Distribution Shift over Time

1 code implementation • 25 Nov 2022 • Huaxiu Yao, Caroline Choi, Bochuan Cao, Yoonho Lee, Pang Wei Koh, Chelsea Finn

Temporal shifts -- distribution shifts arising from the passage of time -- often occur gradually and have the additional structure of timestamp metadata.

Continual Learning Domain Generalization +3

Paper
Code

Extending the WILDS Benchmark for Unsupervised Adaptation

1 code implementation • ICLR 2022 • Shiori Sagawa, Pang Wei Koh, Tony Lee, Irena Gao, Sang Michael Xie, Kendrick Shen, Ananya Kumar, Weihua Hu, Michihiro Yasunaga, Henrik Marklund, Sara Beery, Etienne David, Ian Stavness, Wei Guo, Jure Leskovec, Kate Saenko, Tatsunori Hashimoto, Sergey Levine, Chelsea Finn, Percy Liang

Unlabeled data can be a powerful point of leverage for mitigating these distribution shifts, as it is frequently much more available than labeled data and can often be obtained from distributions beyond the source distribution as well.

Paper
Code

On the Opportunities and Risks of Foundation Models

2 code implementations • 16 Aug 2021 • Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, aditi raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

AI is undergoing a paradigm shift with the rise of models (e. g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.

Transfer Learning

850

Paper
Code

Just Train Twice: Improving Group Robustness without Training Group Information

1 code implementation • 19 Jul 2021 • Evan Zheran Liu, Behzad Haghgoo, Annie S. Chen, aditi raghunathan, Pang Wei Koh, Shiori Sagawa, Percy Liang, Chelsea Finn

Standard training via empirical risk minimization (ERM) can produce models that achieve high accuracy on average but low accuracy on certain groups, especially in the presence of spurious correlations between the input and label.

Ranked #1 on Out-of-Distribution Generalization on ImageNet-W

Image Classification Out-of-Distribution Generalization

Paper
Code

Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization

1 code implementation • 9 Jul 2021 • John Miller, Rohan Taori, aditi raghunathan, Shiori Sagawa, Pang Wei Koh, Vaishaal Shankar, Percy Liang, Yair Carmon, Ludwig Schmidt

For machine learning systems to be reliable, we must understand their performance in unseen, out-of-distribution environments.

Classification Domain Adaptation +1

112

Paper
Code

WILDS: A Benchmark of in-the-Wild Distribution Shifts

6 code implementations • 14 Dec 2020 • Pang Wei Koh, Shiori Sagawa, Henrik Marklund, Sang Michael Xie, Marvin Zhang, Akshay Balsubramani, Weihua Hu, Michihiro Yasunaga, Richard Lanas Phillips, Irena Gao, Tony Lee, Etienne David, Ian Stavness, Wei Guo, Berton A. Earnshaw, Imran S. Haque, Sara Beery, Jure Leskovec, Anshul Kundaje, Emma Pierson, Sergey Levine, Chelsea Finn, Percy Liang

Distribution shifts -- where the training distribution differs from the test distribution -- can substantially degrade the accuracy of machine learning (ML) systems deployed in the wild.

1,333

Paper
Code

Selective Classification Can Magnify Disparities Across Groups

1 code implementation • ICLR 2021 • Erik Jones, Shiori Sagawa, Pang Wei Koh, Ananya Kumar, Percy Liang

In this paper, we find that while selective classification can improve average accuracies, it can simultaneously magnify existing accuracy disparities between various groups within a population, especially in the presence of spurious correlations.

Classification General Classification

Paper
Code

Concept Bottleneck Models

4 code implementations • ICML 2020 • Pang Wei Koh, Thao Nguyen, Yew Siang Tang, Stephen Mussmann, Emma Pierson, Been Kim, Percy Liang

We seek to learn models that we can interact with using high-level concepts: if the model did not think there was a bone spur in the x-ray, would it still predict severe arthritis?

157

Paper
Code

An Investigation of Why Overparameterization Exacerbates Spurious Correlations

3 code implementations • 9 May 2020 • Shiori Sagawa, aditi raghunathan, Pang Wei Koh, Percy Liang

We study why overparameterization -- increasing model size well beyond the point of zero training error -- can hurt test error on minority groups despite improving average test error when there are spurious correlations in the data.

Inductive Bias

Paper
Code

ExpBERT: Representation Engineering with Natural Language Explanations

2 code implementations • ACL 2020 • Shikhar Murty, Pang Wei Koh, Percy Liang

Suppose we want to specify the inductive bias that married couples typically go on honeymoons for the task of extracting pairs of spouses from text.

Inductive Bias Relation Extraction +1

Paper
Code

Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims

no code implementations • 15 Apr 2020 • Miles Brundage, Shahar Avin, Jasmine Wang, Haydn Belfield, Gretchen Krueger, Gillian Hadfield, Heidy Khlaaf, Jingying Yang, Helen Toner, Ruth Fong, Tegan Maharaj, Pang Wei Koh, Sara Hooker, Jade Leung, Andrew Trask, Emma Bluemke, Jonathan Lebensbold, Cullen O'Keefe, Mark Koren, Théo Ryffel, JB Rubinovitz, Tamay Besiroglu, Federica Carugati, Jack Clark, Peter Eckersley, Sarah de Haas, Maritza Johnson, Ben Laurie, Alex Ingerman, Igor Krawczuk, Amanda Askell, Rosario Cammarota, Andrew Lohn, David Krueger, Charlotte Stix, Peter Henderson, Logan Graham, Carina Prunkl, Bianca Martin, Elizabeth Seger, Noa Zilberman, Seán Ó hÉigeartaigh, Frens Kroeger, Girish Sastry, Rebecca Kagan, Adrian Weller, Brian Tse, Elizabeth Barnes, Allan Dafoe, Paul Scharre, Ariel Herbert-Voss, Martijn Rasser, Shagun Sodhani, Carrick Flynn, Thomas Krendl Gilbert, Lisa Dyer, Saif Khan, Yoshua Bengio, Markus Anderljung

With the recent wave of progress in artificial intelligence (AI) has come a growing awareness of the large-scale impacts of AI systems, and recognition that existing regulations and norms in industry and academia are insufficient to ensure responsible AI development.

Computers and Society

Paper
Add Code

Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization

8 code implementations • 20 Nov 2019 • Shiori Sagawa, Pang Wei Koh, Tatsunori B. Hashimoto, Percy Liang

Distributionally robust optimization (DRO) allows us to learn models that instead minimize the worst-case training loss over a set of pre-defined groups.

Ranked #1 on Out-of-Distribution Generalization on UrbanCars

Domain Generalization Natural Language Inference +2

1,333

Paper
Code

Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations

1 code implementation • 14 Sep 2019 • Sawyer Birnbaum, Volodymyr Kuleshov, Zayd Enam, Pang Wei Koh, Stefano Ermon

Learning representations that accurately capture long-range dependencies in sequential inputs -- including text, audio, and genomic data -- is a key problem in deep learning.

Ranked #2 on Audio Super-Resolution on Voice Bank corpus (VCTK) (using extra training data)

Audio Super-Resolution Super-Resolution +2

Paper
Code

On the Accuracy of Influence Functions for Measuring Group Effects

2 code implementations • NeurIPS 2019 • Pang Wei Koh, Kai-Siang Ang, Hubert H. K. Teo, Percy Liang

Influence functions estimate the effect of removing a training point on a model without the need to retrain.

Influence Approximation

Paper
Code

Stronger Data Poisoning Attacks Break Data Sanitization Defenses

2 code implementations • 2 Nov 2018 • Pang Wei Koh, Jacob Steinhardt, Percy Liang

In this paper, we develop three attacks that can bypass a broad range of common data sanitization defenses, including anomaly detectors based on nearest neighbors, training loss, and singular-value decomposition.

Data Poisoning Sentiment Analysis +2

Paper
Code

Inferring Multidimensional Rates of Aging from Cross-Sectional Data

1 code implementation • 12 Jul 2018 • Emma Pierson, Pang Wei Koh, Tatsunori Hashimoto, Daphne Koller, Jure Leskovec, Nicholas Eriksson, Percy Liang

Motivated by the study of human aging, we present an interpretable latent-variable model that learns temporal dynamics from cross-sectional data.

Human Aging Time Series +1

Paper
Code

Certified Defenses for Data Poisoning Attacks

2 code implementations • NeurIPS 2017 • Jacob Steinhardt, Pang Wei Koh, Percy Liang

Machine learning systems trained on user-provided data are susceptible to data poisoning attacks, whereby malicious users inject false training data with the aim of corrupting the learned model.

Data Poisoning

Paper
Code

Understanding Black-box Predictions via Influence Functions

19 code implementations • ICML 2017 • Pang Wei Koh, Percy Liang

How can we explain the predictions of a black-box model?

761

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.