1 code implementation • 22 Apr 2024 • Jan-Philipp Fränken, Eric Zelikman, Rafael Rafailov, Kanishk Gandhi, Tobias Gerstenberg, Noah D. Goodman
On single-turn dialogue and summarization, a SAMI-trained mistral-7b outperforms the initial pretrained model, with win rates between 66% and 77%.
1 code implementation • 14 Mar 2024 • Eric Zelikman, Georges Harik, Yijia Shao, Varuna Jayasiri, Nick Haber, Noah D. Goodman
Crucially, these improvements require no fine-tuning on these tasks.
no code implementations • 10 Oct 2023 • Eric Zelikman, Wanjing Anya Ma, Jasmine E. Tran, Diyi Yang, Jason D. Yeatman, Nick Haber
Developing an educational test can be expensive and time-consuming, as each item must be written by experts and then evaluated by collecting hundreds of student responses.
1 code implementation • 3 Oct 2023 • Eric Zelikman, Eliana Lorch, Lester Mackey, Adam Tauman Kalai
In this work, we use a language-model-infused scaffolding program to improve itself.
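As an illustration of this recursive setup, the sketch below shows a scaffold handing its own source code to a language model and keeping a rewrite only when a utility score improves. The helpers query_lm and evaluate_utility are hypothetical placeholders, not the paper's released code.

```python
# Minimal sketch of a self-improving scaffold. query_lm and evaluate_utility
# are hypothetical placeholders, not the paper's implementation.

SEED_IMPROVER = (
    "You are given a program that improves programs. Rewrite it so that it "
    "scores higher on the utility function. Return only code.\n\n{program}"
)

def query_lm(prompt: str) -> str:
    # Placeholder: swap in a real language-model API call here.
    return prompt

def evaluate_utility(program_source: str) -> float:
    # Placeholder: run the candidate scaffold on downstream tasks and return a score.
    return 0.0

def self_improve(scaffold_source: str, rounds: int = 3) -> str:
    best_source, best_score = scaffold_source, evaluate_utility(scaffold_source)
    for _ in range(rounds):
        # Hand the current scaffold to the LM as text and ask for an improved version.
        candidate = query_lm(SEED_IMPROVER.format(program=best_source))
        score = evaluate_utility(candidate)
        if score > best_score:  # keep a rewrite only if it measurably helps
            best_source, best_score = candidate, score
    return best_source
```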
1 code implementation • 21 Sep 2023 • Elisa Kreiss, Eric Zelikman, Christopher Potts, Nick Haber
None of the methods is successful with ContextRef, but we show that careful fine-tuning yields substantial improvements.
no code implementations • 11 Sep 2023 • Ruocheng Wang, Eric Zelikman, Gabriel Poesia, Yewen Pu, Nick Haber, Noah D. Goodman
Because generation with state-of-the-art LLMs is prohibitively expensive, we add an intermediate step to filter the set of hypotheses that will be implemented as programs: we either ask the LLM to summarize them into a smaller set of hypotheses or ask human annotators to select a subset.
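A minimal sketch of the LLM-summarization variant of this filtering step is shown below; the prompt wording and the query_lm helper are illustrative assumptions rather than the paper's implementation.

```python
# Sketch of the filtering step: condense a large hypothesis set before paying
# the cost of turning each hypothesis into a program. query_lm is a
# hypothetical placeholder for an LLM call.

SUMMARIZE_PROMPT = (
    "Below are {n} candidate hypotheses about the hidden rule. "
    "Merge near-duplicates and return the {k} most distinct ones, "
    "one per line.\n\n{hypotheses}"
)

def query_lm(prompt: str) -> str:
    # Placeholder: replace with a real language-model call.
    return ""

def filter_hypotheses(hypotheses: list[str], k: int = 10) -> list[str]:
    prompt = SUMMARIZE_PROMPT.format(
        n=len(hypotheses), k=k, hypotheses="\n".join(hypotheses)
    )
    reply = query_lm(prompt)
    kept = [line.strip() for line in reply.splitlines() if line.strip()]
    return kept[:k]  # only these survivors get implemented as programs
```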
1 code implementation • 20 Jun 2023 • Yuhao Nie, Eric Zelikman, Andea Scott, Quentin Paletta, Adam Brandt
Furthermore, we feed the future sky images generated by the video prediction models into 15-minute-ahead probabilistic solar forecasting for a 30-kW rooftop PV system, and compare the results with an end-to-end deep learning baseline model (SUNSET) and a smart persistence model.
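For reference, a smart persistence baseline is commonly formulated as holding the current clear-sky index fixed over the forecast horizon; the sketch below follows that standard formulation, with illustrative numbers rather than values from the paper.

```python
# Sketch of a smart persistence baseline as commonly defined for PV forecasting:
# assume the clear-sky index observed now persists over the forecast horizon.
# Variable names and example values are illustrative, not from the paper.

def smart_persistence(power_now_kw: float,
                      clear_sky_now_kw: float,
                      clear_sky_future_kw: float) -> float:
    """15-minute-ahead forecast: scale the clear-sky power at t+15min
    by the clear-sky index observed at time t."""
    if clear_sky_now_kw <= 0.0:          # e.g., night-time; avoid division by zero
        return 0.0
    clear_sky_index = power_now_kw / clear_sky_now_kw
    return clear_sky_index * clear_sky_future_kw

# Example for a 30-kW system: 18 kW measured now under a 24-kW clear-sky
# estimate gives an index of 0.75; with 25 kW of clear-sky output expected
# 15 minutes ahead, the forecast is 0.75 * 25 = 18.75 kW.
print(smart_persistence(18.0, 24.0, 25.0))
```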
1 code implementation • 16 Jun 2023 • Eric Zelikman, Qian Huang, Percy Liang, Nick Haber, Noah D. Goodman
Language model training in distributed settings is limited by the communication cost of gradient exchanges.
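The scale of that communication cost is easy to estimate. The back-of-the-envelope sketch below assumes fp16 gradients, eight data-parallel workers, and a ring all-reduce exchange pattern; none of these specifics are taken from the paper.

```python
# Back-of-the-envelope estimate of per-step gradient traffic for data-parallel
# training with ring all-reduce. Model size, precision, and worker count are
# illustrative assumptions, not numbers from the paper.

def allreduce_bytes_per_step(num_params: int, bytes_per_grad: int = 2,
                             num_workers: int = 8) -> float:
    grad_bytes = num_params * bytes_per_grad
    # Ring all-reduce: each worker sends roughly 2*(n-1)/n times the gradient size.
    return 2 * (num_workers - 1) / num_workers * grad_bytes

# A 7B-parameter model with fp16 gradients on 8 workers exchanges roughly
# 24.5 GB per worker on every optimizer step.
print(allreduce_bytes_per_step(7_000_000_000) / 1e9, "GB")
```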
no code implementations • 6 Jun 2023 • Gabriel Poesia, Kanishk Gandhi, Eric Zelikman, Noah D. Goodman
In experiments on the PrOntoQA, ProofWriter and Syllogism Validity datasets, LogicGuide significantly improves the performance of GPT-3, GPT-3.5 Turbo and LLaMA (accuracy gains of up to 35%), while drastically reducing content effects -- the interference between unwanted prior assumptions and reasoning, which both humans and language models suffer from.
1 code implementation • 20 Dec 2022 • Eric Zelikman, Qian Huang, Gabriel Poesia, Noah D. Goodman, Nick Haber
Despite recent success in large language model (LLM) reasoning, LLMs struggle with hierarchical multi-step reasoning tasks like generating complex programs.
Ranked #8 on Code Generation on HumanEval
1 code implementation • 16 Nov 2022 • Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, Jue Wang, Keshav Santhanam, Laurel Orr, Lucia Zheng, Mert Yuksekgonul, Mirac Suzgun, Nathan Kim, Neel Guha, Niladri Chatterji, Omar Khattab, Peter Henderson, Qian Huang, Ryan Chi, Sang Michael Xie, Shibani Santurkar, Surya Ganguli, Tatsunori Hashimoto, Thomas Icard, Tianyi Zhang, Vishrav Chaudhary, William Wang, Xuechen Li, Yifan Mai, Yuhui Zhang, Yuta Koreeda
We present Holistic Evaluation of Language Models (HELM) to improve the transparency of language models.
1 code implementation • 21 May 2022 • Elisa Kreiss, Cynthia Bennett, Shayan Hooshmand, Eric Zelikman, Meredith Ringel Morris, Christopher Potts
Few images on the Web receive alt-text descriptions that would make them accessible to blind and low vision (BLV) users.
1 code implementation • 28 Mar 2022 • Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman
We show that STaR significantly improves performance on multiple datasets compared to a model fine-tuned to directly predict final answers, and performs comparably to fine-tuning a 30× larger state-of-the-art language model on CommonsenseQA.
Ranked #17 on Common Sense Reasoning on CommonsenseQA
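The bootstrapping loop behind STaR (sample a rationale, keep it only when it yields the correct answer, then fine-tune on the kept examples) can be sketched compactly; the helpers below are hypothetical placeholders, not the released code.

```python
# Compact sketch of one STaR bootstrapping iteration. query_lm, extract_answer,
# and finetune are hypothetical placeholders, not the paper's implementation.

def query_lm(prompt: str) -> str:
    # Placeholder: replace with a real language-model call.
    return ""

def extract_answer(rationale: str) -> str:
    # Placeholder: parse the final answer out of the generated rationale.
    return rationale.rsplit(" ", 1)[-1]

def finetune(examples: list[tuple[str, str]]) -> None:
    # Placeholder: fine-tune the base model on (question, rationale) pairs.
    pass

def star_iteration(dataset: list[tuple[str, str]]) -> None:
    kept = []
    for question, gold_answer in dataset:
        rationale = query_lm(f"{question}\nLet's think step by step.")
        if extract_answer(rationale) == gold_answer:
            kept.append((question, rationale))  # keep only rationales that reach the right answer
    finetune(kept)  # the fine-tuned model generates rationales in the next iteration
```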
no code implementations • 9 Oct 2020 • Eric Zelikman, Sharon Zhou, Jeremy Irvin, Cooper Raterink, Hao Sheng, Anand Avati, Jack Kelly, Ram Rajagopal, Andrew Y. Ng, David Gagne
Advancing probabilistic solar forecasting methods is essential to supporting the integration of solar energy into the electricity grid.
1 code implementation • ICLR 2021 • Sharon Zhou, Eric Zelikman, Fred Lu, Andrew Y. Ng, Gunnar Carlsson, Stefano Ermon
Learning disentangled representations is regarded as a fundamental task for improving the generalization, robustness, and interpretability of generative models.
no code implementations • 26 May 2020 • Eric Zelikman, Christopher Healy, Sharon Zhou, Anand Avati
Calibrated uncertainty estimates in machine learning are crucial to many fields such as autonomous vehicles, medicine, and weather and climate forecasting.
no code implementations • 20 Apr 2020 • Eric Zelikman, William Yin, Kenneth Wang
A significant challenge in developing AI that can generalize well is designing agents that learn about their world without being told what to learn, and apply that learning to challenges with sparse rewards.
1 code implementation • 22 Mar 2018 • Eric Zelikman, Richard Socher
We introduce contextual salience (CoSal), a measure of word importance that uses the distribution of context vectors to normalize distances and weights.
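One way to realize this idea is a Mahalanobis-style distance of each word vector from the mean of its context, normalized by the context's covariance; the sketch below follows that reading and is not necessarily the paper's exact formulation.

```python
# Illustrative take on context-normalized word importance: score each word by
# the Mahalanobis distance of its vector from the mean of the context's
# vectors. This follows the idea described above, not the paper's exact method.
import numpy as np

def contextual_salience(word_vectors: np.ndarray) -> np.ndarray:
    """word_vectors: (n_words, dim) array of embeddings for one context.
    Returns one importance score per word."""
    mean = word_vectors.mean(axis=0)
    centered = word_vectors - mean
    # Regularized covariance of the context's vectors normalizes distances.
    cov = np.cov(centered, rowvar=False) + 1e-6 * np.eye(word_vectors.shape[1])
    cov_inv = np.linalg.inv(cov)
    return np.sqrt(np.einsum("nd,dk,nk->n", centered, cov_inv, centered))
```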