Search Results for author: Victor Zhong

Found 27 papers, 21 papers with code

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

no code implementations • 11 Apr 2024 • Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, Yitao Liu, Yiheng Xu, Shuyan Zhou, Silvio Savarese, Caiming Xiong, Victor Zhong, Tao Yu

Autonomous agents that accomplish complex computer tasks with minimal human interventions have the potential to transform human-computer interaction, significantly enhancing accessibility and productivity.

Benchmarking

Paper
Add Code

Policy Improvement using Language Feedback Models

no code implementations • 12 Feb 2024 • Victor Zhong, Dipendra Misra, Xingdi Yuan, Marc-Alexandre Côté

We introduce Language Feedback Models (LFMs) that identify desirable behaviour - actions that help achieve tasks specified in the instruction - for imitation learning in instruction following.

Behavioural cloning Instruction Following

Paper
Add Code

Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning

1 code implementation • 20 Sep 2023 • Tianbao Xie, Siheng Zhao, Chen Henry Wu, Yitao Liu, Qian Luo, Victor Zhong, Yanchao Yang, Tao Yu

Unlike inverse RL and recent work that uses LLMs to write sparse reward codes, Text2Reward produces interpretable, free-form dense reward codes that cover a wide range of tasks, utilize existing packages, and allow iterative refinement with human feedback.

reinforcement-learning Reinforcement Learning (RL)

Paper
Code

When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories

1 code implementation • 20 Dec 2022 • Alex Mallen, Akari Asai, Victor Zhong, Rajarshi Das, Daniel Khashabi, Hannaneh Hajishirzi

Despite their impressive performance on diverse tasks, large language models (LMs) still struggle with tasks requiring rich world knowledge, implying the limitations of relying solely on their parameters to encode a wealth of world knowledge.

Knowledge Probing Memorization +2

143

Paper
Code

RoMQA: A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering

1 code implementation • 25 Oct 2022 • Victor Zhong, Weijia Shi, Wen-tau Yih, Luke Zettlemoyer

Moreover, existing models are not robust to variations in question constraints, but can be made more robust by tuning on clusters of related questions.

Question Answering Retrieval

Paper
Code

M2D2: A Massively Multi-domain Language Modeling Dataset

1 code implementation • 13 Oct 2022 • Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer

We present M2D2, a fine-grained, massively multi-domain corpus for studying domain adaptation in language models (LMs).

Domain Generalization Language Modelling

Paper
Code

Improving Policy Learning via Language Dynamics Distillation

1 code implementation • 30 Sep 2022 • Victor Zhong, Jesse Mu, Luke Zettlemoyer, Edward Grefenstette, Tim Rocktäschel

Recent work has shown that augmenting environments with language descriptions improves policy learning.

NetHack Reinforcement Learning (RL)

Paper
Code

Improving Intrinsic Exploration with Language Abstractions

1 code implementation • 17 Feb 2022 • Jesse Mu, Victor Zhong, Roberta Raileanu, Minqi Jiang, Noah Goodman, Tim Rocktäschel, Edward Grefenstette

Reinforcement learning (RL) agents are particularly hard to train when rewards are sparse.

reinforcement-learning Reinforcement Learning (RL)

456

Paper
Code

UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

1 code implementation • 16 Jan 2022 • Tianbao Xie, Chen Henry Wu, Peng Shi, Ruiqi Zhong, Torsten Scholak, Michihiro Yasunaga, Chien-Sheng Wu, Ming Zhong, Pengcheng Yin, Sida I. Wang, Victor Zhong, Bailin Wang, Chengzu Li, Connor Boyle, Ansong Ni, Ziyu Yao, Dragomir Radev, Caiming Xiong, Lingpeng Kong, Rui Zhang, Noah A. Smith, Luke Zettlemoyer, Tao Yu

Structured knowledge grounding (SKG) leverages structured knowledge to complete user requests, such as semantic parsing over databases and question answering over knowledge bases.

Ranked #1 on Task-Oriented Dialogue Systems on KVRET

Few-Shot Learning Question Answering +3

531

Paper
Code

SILG: The Multi-domain Symbolic Interactive Language Grounding Benchmark

no code implementations • NeurIPS 2021 • Victor Zhong, Austin Hanjie, Sida Wang, Karthik Narasimhan, Luke Zettlemoyer

We hope SILG enables the community to quickly identify new methodolo- gies for language grounding that generalize to a diverse set of environments and their associated challenges.

Grounded language learning NetHack

Paper
Add Code

SILG: The Multi-environment Symbolic Interactive Language Grounding Benchmark

1 code implementation • 20 Oct 2021 • Victor Zhong, Austin W. Hanjie, Sida I. Wang, Karthik Narasimhan, Luke Zettlemoyer

We hope SILG enables the community to quickly identify new methodologies for language grounding that generalize to a diverse set of environments and their associated challenges.

Grounded language learning NetHack

Paper
Code

LEWIS: Levenshtein Editing for Unsupervised Text Style Transfer

1 code implementation • Findings (ACL) 2021 • Machel Reid, Victor Zhong

Moreover, compared to previous methods on unsupervised data synthesis, our method results in higher quality parallel style pairs and improves model performance.

Style Transfer Text Style Transfer +1

Paper
Code

Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning

1 code implementation • 19 Jan 2021 • Austin W. Hanjie, Victor Zhong, Karthik Narasimhan

We investigate the use of natural language to drive the generalization of control policies and introduce the new multi-task environment Messenger with free-form text manuals describing the environment dynamics.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Code

Grounded Adaptation for Zero-shot Executable Semantic Parsing

1 code implementation • EMNLP 2020 • Victor Zhong, Mike Lewis, Sida I. Wang, Luke Zettlemoyer

We propose Grounded Adaptation for Zero-shot Executable Semantic Parsing (GAZP) to adapt an existing semantic parser to new environments (e. g. new database schemas).

Ranked #6 on Text-To-SQL on SParC

Data Augmentation Dialogue State Tracking +2

Paper
Code

RTFM: Generalising to New Environment Dynamics via Reading

no code implementations • ICLR 2020 • Victor Zhong, Tim Rocktäschel, Edward Grefenstette

In this work, we demonstrate that language understanding via a reading policy learner is a promising vehicle for generalisation to new environments.

Paper
Add Code

RTFM: Generalising to Novel Environment Dynamics via Reading

1 code implementation • 18 Oct 2019 • Victor Zhong, Tim Rocktäschel, Edward Grefenstette

In this work, we demonstrate that language understanding via a reading policy learner is a promising vehicle for generalisation to new environments.

Paper
Code

E3: Entailment-driven Extracting and Editing for Conversational Machine Reading

1 code implementation • ACL 2019 • Victor Zhong, Luke Zettlemoyer

Conversational machine reading systems help users answer high-level questions (e. g. determine if they qualify for particular government benefits) when they do not know the exact rules by which the determination is made(e. g. whether they need certain income levels or veteran status).

Reading Comprehension

Paper
Code

Multi-hop Reading Comprehension through Question Decomposition and Rescoring

2 code implementations • ACL 2019 • Sewon Min, Victor Zhong, Luke Zettlemoyer, Hannaneh Hajishirzi

Multi-hop Reading Comprehension (RC) requires reasoning and aggregation across several paragraphs.

Ranked #65 on Question Answering on HotpotQA

Decision Making Multi-Hop Reading Comprehension +2

137

Paper
Code

Coarse-grain Fine-grain Coattention Network for Multi-evidence Question Answering

no code implementations • ICLR 2019 • Victor Zhong, Caiming Xiong, Nitish Shirish Keskar, Richard Socher

End-to-end neural models have made significant progress in question answering, however recent studies show that these models implicitly assume that the answer and evidence appear close together in a single document.

Ranked #5 on Question Answering on WikiHop

Question Answering

Paper
Add Code

Global-Locally Self-Attentive Encoder for Dialogue State Tracking

no code implementations • ACL 2018 • Victor Zhong, Caiming Xiong, Richard Socher

Dialogue state tracking, which estimates user goals and requests given the dialogue context, is an essential part of task-oriented dialogue systems.

Automatic Speech Recognition (ASR) Dialogue State Tracking +3

Paper
Add Code

Efficient and Robust Question Answering from Minimal Context over Documents

1 code implementation • ACL 2018 • Sewon Min, Victor Zhong, Richard Socher, Caiming Xiong

Neural models for question answering (QA) over documents have achieved significant performance improvements.

Ranked #3 on Question Answering on NewsQA

Question Answering Reading Comprehension +2

Paper
Code

Global-Locally Self-Attentive Dialogue State Tracker

2 code implementations • 19 May 2018 • Victor Zhong, Caiming Xiong, Richard Socher

Dialogue state tracking, which estimates user goals and requests given the dialogue context, is an essential part of task-oriented dialogue systems.

Ranked #3 on Dialogue State Tracking on Second dialogue state tracking challenge

Dialogue State Tracking Multi-domain Dialogue State Tracking +1

185

Paper
Code

DCN+: Mixed Objective and Deep Residual Coattention for Question Answering

1 code implementation • ICLR 2018 • Caiming Xiong, Victor Zhong, Richard Socher

Traditional models for question answering optimize using cross entropy loss, which encourages exact answers at the cost of penalizing nearby or overlapping answers that are sometimes equally accurate.

Ranked #28 on Question Answering on SQuAD1.1 dev