1 code implementation • 9 Apr 2024 • Jiayi Pan, Yichi Zhang, Nicholas Tomlin, Yifei Zhou, Sergey Levine, Alane Suhr
We show that domain-general automatic evaluators can significantly improve the performance of agents for web navigation and device control.
1 code implementation • 29 Feb 2024 • Yifei Zhou, Andrea Zanette, Jiayi Pan, Sergey Levine, Aviral Kumar
In this paper, we develop a framework for building multi-turn RL algorithms for fine-tuning LLMs that preserves the flexibility of existing single-turn RL methods for LLMs (e.g., proximal policy optimization), while accommodating multiple turns, long horizons, and delayed rewards effectively.
1 code implementation • 14 Nov 2023 • Yifei Zhou, Ayush Sekhari, Yuda Song, Wen Sun
In this work, we propose a new hybrid RL algorithm that combines an on-policy actor-critic method with offline data.
2 code implementations • 22 Feb 2023 • Yifei Zhou, Juntao Ren, Fengyu Li, Ramin Zabih, Ser-Nam Lim
Advances in the field of vision-language contrastive learning have made it possible for many downstream applications to be carried out efficiently and accurately by simply taking the dot product between image and text representations.
1 code implementation • ICCV 2023 • Yifei Zhou, Zilu Li, Abhinav Shrivastava, Hengshuang Zhao, Antonio Torralba, Taipeng Tian, Ser-Nam Lim
In this way, the new representation can be directly compared with the old representation, in principle avoiding the need for any backfilling.
1 code implementation • 8 Nov 2022 • Yifei Zhou, Zilu Li, Abhinav Shrivastava, Hengshuang Zhao, Antonio Torralba, Taipeng Tian, Ser-Nam Lim
In this way, the new representation can be directly compared with the old representation, in principle avoiding the need for any backfilling.
1 code implementation • 13 Oct 2022 • Yuda Song, Yifei Zhou, Ayush Sekhari, J. Andrew Bagnell, Akshay Krishnamurthy, Wen Sun
We consider a hybrid reinforcement learning setting (Hybrid RL), in which an agent has access to an offline dataset and the ability to collect experience via real-world online interaction.
1 code implementation • 10 Oct 2022 • Yujie Zhang, Qi Yang, Yifei Zhou, Xiaozhong Xu, Le Yang, Yiling Xu
The goal of objective point cloud quality assessment (PCQA) research is to develop quantitative metrics that measure point cloud quality in a perceptually consistent manner.
1 code implementation • 5 Oct 2022 • Yifei Zhou, Renyu Li, Hayden Housen, Ser-Nam Lim
Paraphrase identification is a fundamental task in natural language processing.
no code implementations • Findings (NAACL) 2022 • Yifei Zhou, Yansong Feng
Recent works show that discourse analysis benefits from modeling intra- and inter-sentential levels separately, where proper representations for text units of different granularities are desired to capture both the meaning of text units and their relations to the context.
no code implementations • 21 Apr 2022 • Shrenik Zinage, Suyash Jadhav, Yifei Zhou, Ilias Bilionis, Peter Meckl
The objective of this paper is to develop a model that predicts the transient and steady-state behavior of the turbine using the Koopman operator, which can be helpful for control design and analysis.