1 code implementation • 31 May 2024 • Elias Stengel-Eskin, Peter Hase, Mohit Bansal
To calibrate both implicit and explicit confidence markers, we introduce a pragmatic, listener-aware finetuning method (LACIE) that models the listener, considering not only whether an answer is right, but also whether it will be accepted by a listener.
1 code implementation • 29 May 2024 • Ziyang Wang, Shoubin Yu, Elias Stengel-Eskin, Jaehong Yoon, Feng Cheng, Gedas Bertasius, Mohit Bansal
Recently, many long video-language understanding approaches have leveraged the reasoning capabilities of Large Language Models (LLMs) to perform long video QA, transforming videos into densely sampled frame captions, and asking LLMs to respond to text queries over captions.
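The caption-then-answer pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `caption_model` and `llm` are stand-ins for a frame captioner and an LLM call, and the names are hypothetical.

```python
def video_qa_over_captions(frames, caption_model, llm, query, stride=2):
    """Densely sample frames, caption each one, and ask an LLM the
    text query over the concatenated captions (a simplified sketch
    of the caption-based long video QA pipeline)."""
    sampled = frames[::stride]                      # dense frame sampling
    captions = [caption_model(f) for f in sampled]  # frame -> text caption
    context = "\n".join(f"Frame {i}: {c}" for i, c in enumerate(captions))
    prompt = f"{context}\n\nQuestion: {query}\nAnswer:"
    return llm(prompt)
```

With stub captioner and LLM functions, the sketch simply assembles the captions into a prompt; a real system would plug in a vision-language captioner and an instruction-tuned LLM.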
Ranked #1 on Zero-Shot Video Question Answer on IntentQA
1 code implementation • 4 May 2024 • Maryam Hashemzadeh, Elias Stengel-Eskin, Sarath Chandar, Marc-Alexandre Côté
While Large Language Models (LLMs) have demonstrated significant promise as agents in interactive tasks, their substantial computational requirements and restricted number of calls limit their practical utility, especially in long-horizon interactive tasks such as decision-making, or in scenarios with continuously ongoing tasks.
no code implementations • 4 Mar 2024 • David Wan, Jaemin Cho, Elias Stengel-Eskin, Mohit Bansal
Highlighting particularly relevant regions of an image can improve the performance of vision-language models (VLMs) on various vision-language (VL) tasks by guiding the model to attend more closely to these regions of interest.
no code implementations • 26 Feb 2024 • Haotian Fu, Pratyusha Sharma, Elias Stengel-Eskin, George Konidaris, Nicolas Le Roux, Marc-Alexandre Côté, Xingdi Yuan
We present an algorithm for skill discovery from expert demonstrations.
1 code implementation • 20 Feb 2024 • Han Wang, Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal
Current "sample and select" methods such as self-consistency (SC) rely on majority voting to score answers.
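The majority-voting step behind self-consistency (SC) can be sketched as follows. This is an illustrative sketch, not the paper's method: `sample_answer` stands in for one stochastic LLM call, and the function name is hypothetical.

```python
from collections import Counter

def self_consistency(sample_answer, prompt, n_samples=5):
    """Sample several answers to the same prompt and return the
    majority-vote winner together with its vote share. In SC, each
    sampled reasoning chain yields a final answer, and the most
    frequent answer is selected."""
    answers = [sample_answer(prompt) for _ in range(n_samples)]
    counts = Counter(answers)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / n_samples  # answer plus its vote share

# Toy sampler: a deterministic cycle of candidate answers.
samples = iter(["42", "42", "17", "42", "17"])
answer, share = self_consistency(lambda p: next(samples), "Q: ...", 5)
```

Because scoring is purely by answer frequency, every vote counts equally regardless of the quality of the reasoning that produced it, which is the limitation the paper targets.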
1 code implementation • 19 Feb 2024 • Jinhao Duan, Renming Zhang, James Diffenderfer, Bhavya Kailkhura, Lichao Sun, Elias Stengel-Eskin, Mohit Bansal, Tianlong Chen, Kaidi Xu
As Large Language Models (LLMs) are integrated into critical real-world applications, their strategic and logical reasoning abilities are increasingly crucial.
1 code implementation • 2 Feb 2024 • Justin Chih-Yao Chen, Swarnadeep Saha, Elias Stengel-Eskin, Mohit Bansal
Experiments on seven widely-used commonsense and math reasoning benchmarks show that MAGDi improves the reasoning capabilities of smaller models, outperforming several methods that distill from a single teacher and multiple teachers.
1 code implementation • 29 Jan 2024 • Elias Stengel-Eskin, Archiki Prasad, Mohit Bansal
While large language models (LLMs) are increasingly being used for program synthesis, they lack the global view needed to develop useful abstractions; they generally predict programs one at a time, often repeating the same functionality.
1 code implementation • 9 Oct 2023 • Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal
An increasing number of vision-language tasks can be handled with little to no training, i.e., in a zero- and few-shot manner, by marrying large language models (LLMs) to vision encoders, resulting in large vision-language models (LVLMs).
1 code implementation • 1 Jun 2023 • Elias Stengel-Eskin, Kyle Rawlins, Benjamin Van Durme
We attempt to address this shortcoming by introducing AmP, a framework, dataset, and challenge for translating ambiguous natural language to formal representations like logic and code.
no code implementations • 29 Mar 2023 • Elias Stengel-Eskin, Benjamin Van Durme
We then examine how confidence scores can help optimize the trade-off between usability and safety.
2 code implementations • CVPR 2023 • Zhuowan Li, Xingrui Wang, Elias Stengel-Eskin, Adam Kortylewski, Wufei Ma, Benjamin Van Durme, Alan Yuille
Visual Question Answering (VQA) models often perform poorly on out-of-distribution data and struggle with domain generalization.
1 code implementation • 14 Nov 2022 • Elias Stengel-Eskin, Jimena Guallar-Blasco, Yi Zhou, Benjamin Van Durme
Natural language is ambiguous.
2 code implementations • 14 Nov 2022 • Elias Stengel-Eskin, Benjamin Van Durme
Sequence generation models are increasingly being used to translate natural language into programs, i.e., to perform executable semantic parsing.
1 code implementation • 24 May 2022 • Elias Stengel-Eskin, Benjamin Van Durme
Given the advanced fluency of large generative language models, we ask whether model outputs are consistent with these heuristics, and to what degree different models are consistent with each other.
1 code implementation • 24 May 2022 • Elias Stengel-Eskin, Emmanouil Antonios Platanios, Adam Pauls, Sam Thomson, Hao Fang, Benjamin Van Durme, Jason Eisner, Yu Su
Rejecting class imbalance as the sole culprit, we reveal that the trend is closely associated with an effect we call source signal dilution, where strong lexical cues for the new symbol become diluted as the training dataset grows.
1 code implementation • NAACL 2022 • Chenyu Zhang, Benjamin Van Durme, Zhuowan Li, Elias Stengel-Eskin
Our commonsense knowledge about objects includes their typical visual attributes; we know that bananas are typically yellow or green, and not purple.
Ranked #1 on Visual Commonsense Tests on ViComTe-color
2 code implementations • Conference On Robot Learning (CoRL) 2021 • Elias Stengel-Eskin, Andrew Hundt, Zhuohong He, Aditya Murali, Nakul Gopalan, Matthew Gombolay, Gregory Hager
Our model completes block manipulation tasks with synthetic commands 530% more often than a UNet-based baseline, and learns to localize actions correctly while creating a mapping of symbols to perceptual input that supports compositional reasoning.
1 code implementation • ICCV 2021 • Zhuowan Li, Elias Stengel-Eskin, Yixiao Zhang, Cihang Xie, Quan Tran, Benjamin Van Durme, Alan Yuille
Our experiments show CCO substantially boosts the performance of neural symbolic methods on real images.
1 code implementation • 12 Apr 2021 • Elias Stengel-Eskin, Kenton Murray, Sheng Zhang, Aaron Steven White, Benjamin Van Durme
While numerous attempts have been made to jointly parse syntax and semantics, high performance in one domain typically comes at the price of performance in the other.
no code implementations • 1 Jul 2020 • Ryan Culkin, J. Edward Hu, Elias Stengel-Eskin, Guanghui Qin, Benjamin Van Durme
We introduce a novel paraphrastic augmentation strategy based on sentence-level lexically constrained paraphrasing and discriminative span alignment.
no code implementations • ACL 2020 • Elias Stengel-Eskin, Aaron Steven White, Sheng Zhang, Benjamin Van Durme
We introduce a transductive model for parsing into Universal Decompositional Semantics (UDS) representations, which jointly learns to map natural language utterances into UDS graph structures and annotate the graph with decompositional semantic attribute scores.
1 code implementation • LREC 2020 • Aaron Steven White, Elias Stengel-Eskin, Siddharth Vashishtha, Venkata Govindarajan, Dee Ann Reisinger, Tim Vieira, Keisuke Sakaguchi, Sheng Zhang, Francis Ferraro, Rachel Rudinger, Kyle Rawlins, Benjamin Van Durme
We present the Universal Decompositional Semantics (UDS) dataset (v1.0), which is bundled with the Decomp toolkit (v0.1).
no code implementations • IJCNLP 2019 • Elias Stengel-Eskin, Tzu-Ray Su, Matt Post, Benjamin Van Durme
We introduce a novel discriminative word alignment model, which we integrate into a Transformer-based machine translation model.
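A hard word alignment can be read off from a score matrix between source and target token representations, as in the simplified sketch below. This is an assumption-laden illustration, not the paper's model: the learned discriminative scoring head is replaced by a plain dot product, and the function name is hypothetical.

```python
import numpy as np

def extract_alignments(src_states, tgt_states):
    """Align each target token to the source token with the highest
    dot-product score. `src_states` is (src_len, d) and `tgt_states`
    is (tgt_len, d); a trained model would score pairs with a learned
    alignment layer instead of a raw dot product."""
    scores = tgt_states @ src_states.T           # (tgt_len, src_len)
    best_src = scores.argmax(axis=1)             # hard argmax alignment
    return [(t, int(s)) for t, s in enumerate(best_src)]
```

In the discriminative setting, the score matrix is trained against gold alignments rather than emerging as a by-product of translation attention.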