1 code implementation • 17 Apr 2024 • Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, HaoNing Wu, ZiCheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei LI, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, Haiqiang Wang, Xiangguang Chen, Wenhui Meng, Xiang Pan, Huiying Shi, Han Zhu, Xiaozhong Xu, Lei Sun, Zhenzhong Chen, Shan Liu, Fangyuan Kong, Haotian Fan, Yifang Xu, Haoran Xu, Mengduo Yang, Jie zhou, Jiaze Li, Shijie Wen, Mai Xu, Da Li, Shunyu Yao, Jiazhi Du, WangMeng Zuo, Zhibo Li, Shuai He, Anlong Ming, Huiyuan Fu, Huadong Ma, Yong Wu, Fie Xue, Guozhi Zhao, Lina Du, Jie Guo, Yu Zhang, huimin zheng, JunHao Chen, Yue Liu, Dulan Zhou, Kele Xu, Qisheng Xu, Tao Sun, Zhixiang Ding, Yuhang Hu
This paper reviews the NTIRE 2024 Challenge on Shortform UGC Video Quality Assessment (S-UGC VQA), where various excellent solutions are submitted and evaluated on the collected dataset KVQ from popular short-form video platform, i. e., Kuaishou/Kwai Platform.
1 code implementation • 16 Apr 2024 • Quan Shi, Michael Tang, Karthik Narasimhan, Shunyu Yao
In this paper, we introduce the USACO benchmark with 307 problems from the USA Computing Olympiad, along with high-quality unit tests, reference code, and official analyses for each problem.
1 code implementation • 13 Mar 2024 • Bowen Li, Wenhan Wu, Ziwei Tang, Lin Shi, John Yang, Jinyang Li, Shunyu Yao, Chen Qian, Binyuan Hui, Qicheng Zhang, Zhiyin Yu, He Du, Ping Yang, Dahua Lin, Chao Peng, Kai Chen
Recent advancements in large language models (LLMs) have significantly enhanced their coding capabilities.
1 code implementation • 12 Feb 2024 • Zhiyong Wu, Chengcheng Han, Zichen Ding, Zhenmin Weng, Zhoumianze Liu, Shunyu Yao, Tao Yu, Lingpeng Kong
Autonomous interaction with the computer has been a longstanding challenge with great potential, and the recent proliferation of large language models (LLMs) has markedly accelerated progress in building digital agents.
1 code implementation • 19 Oct 2023 • Fei Liu, Xi Lin, Zhenkun Wang, Shunyu Yao, Xialiang Tong, Mingxuan Yuan, Qingfu Zhang
It is also promising to see the operator only learned from a few instances can have robust generalization performance on unseen problems with quite different patterns and settings.
no code implementations • 10 Oct 2023 • Carlos E. Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, Karthik Narasimhan
We find real-world software engineering to be a rich, sustainable, and challenging testbed for evaluating the next generation of language models.
Ranked #2 on Bug fixing on SWE-bench
no code implementations • 9 Oct 2023 • Baian Chen, Chang Shu, Ehsan Shareghi, Nigel Collier, Karthik Narasimhan, Shunyu Yao
Recent efforts have augmented language models (LMs) with external tools or environments, leading to the development of language agents that can reason and act.
Ranked #7 on Question Answering on Bamboogle
no code implementations • 24 Sep 2023 • R. Thomas McCoy, Shunyu Yao, Dan Friedman, Matthew Hardy, Thomas L. Griffiths
This approach - which we call the teleological approach - leads us to identify three factors that we hypothesize will influence LLM accuracy: the probability of the task to be performed, the probability of the target output, and the probability of the provided input.
2 code implementations • 5 Sep 2023 • Theodore R. Sumers, Shunyu Yao, Karthik Narasimhan, Thomas L. Griffiths
Recent efforts have augmented large language models (LLMs) with external resources (e. g., the Internet) or internal control flows (e. g., prompt chaining) for tasks requiring grounding or reasoning, leading to a new class of language agents.
1 code implementation • 17 Jul 2023 • Shunyu Yao, Howard Chen, Austin W. Hanjie, Runzhe Yang, Karthik Narasimhan
Text generation under constraints have seen increasing interests in natural language processing, especially with the rapidly improving capabilities of large language models.
2 code implementations • NeurIPS 2023 • John Yang, Akshara Prabhakar, Karthik Narasimhan, Shunyu Yao
Our framework is language and platform agnostic, uses self-contained Docker environments to provide safe and reproducible execution, and is compatible out-of-the-box with traditional seq2seq coding methods, while enabling the development of new methods for interactive code generation.
1 code implementation • 24 May 2023 • Michael Tang, Shunyu Yao, John Yang, Karthik Narasimhan
We propose Referral-Augmented Retrieval (RAR), a simple technique that concatenates document indices with referrals, i. e. text from other documents that cite or link to the given document, to provide significant performance gains for zero-shot information retrieval.
3 code implementations • NeurIPS 2023 • Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan
Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference.
1 code implementation • 17 May 2023 • Mo Yu, Jiangnan Li, Shunyu Yao, Wenjie Pang, Xiaochen Zhou, Zhou Xiao, Fandong Meng, Jie zhou
As readers engage with a story, their understanding of a character evolves based on new events and information; and multiple fine-grained aspects of personalities can be perceived.
no code implementations • 19 Apr 2023 • Yao Mu, Shunyu Yao, Mingyu Ding, Ping Luo, Chuang Gan
We learn embodied representations of video trajectories, emergent language, and natural language using a language model, which is then used to finetune a lightweight policy network for downstream control.
2 code implementations • NeurIPS 2023 • Noah Shinn, Federico Cassano, Edward Berman, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao
Large language models (LLMs) have been increasingly used to interact with external environments (e. g., games, compilers, APIs) as goal-driven agents.
no code implementations • CVPR 2023 • Yao Mu, Shunyu Yao, Mingyu Ding, Ping Luo, Chuang Gan
We learn embodied representations of video trajectories, emergent language, and natural language using a language model, which is then used to finetune a lightweight policy network for downstream control.
no code implementations • 15 Oct 2022 • Yi Gu, Shunyu Yao, Chuang Gan, Joshua B. Tenenbaum, Mo Yu
Text games present opportunities for natural language understanding (NLU) methods to tackle reinforcement learning (RL) challenges.
5 code implementations • 6 Oct 2022 • Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao
While large language models (LLMs) have demonstrated impressive capabilities across tasks in language understanding and interactive decision making, their abilities for reasoning (e. g. chain-of-thought prompting) and acting (e. g. action plan generation) have primarily been studied as separate topics.
1 code implementation • 4 Jul 2022 • Shunyu Yao, Howard Chen, John Yang, Karthik Narasimhan
Existing benchmarks for grounding language in interactive environments either lack real-world linguistic elements, or prove difficult to scale up due to substantial human involvement in the collection of data or feedback signals.
1 code implementation • NAACL 2022 • Yisi Sang, Xiangyang Mou, Mo Yu, Shunyu Yao, Jing Li, Jeffrey Stanton
We propose a new task for assessing machines' skills of understanding fictional characters in narrative stories.
1 code implementation • ICLR 2022 • Shunyu Yao, Mo Yu, Yang Zhang, Karthik R Narasimhan, Joshua B. Tenenbaum, Chuang Gan
In this work, we propose a novel way to establish such a link by corpus transfer, i. e. pretraining on a corpus of emergent language for downstream natural language tasks, which is in contrast to prior work that directly transfers speaker and listener parameters.
1 code implementation • ICLR 2022 • Jens Tuyls, Shunyu Yao, Sham Kakade, Karthik Narasimhan
Text adventure games present unique challenges to reinforcement learning methods due to their combinatorially large action spaces and sparse rewards.
no code implementations • 3 Jan 2022 • Shunyu Yao, RuiZhe Zhong, Yichao Yan, Guangtao Zhai, Xiaokang Yang
Specifically, neural radiance field takes lip movements features and personalized attributes as two disentangled conditions, where lip movements are directly predicted from the audio inputs to achieve lip-synchronized generation.
1 code implementation • CVPR 2021 • Wei Ji, Jingjing Li, Shuang Yu, Miao Zhang, Yongri Piao, Shunyu Yao, Qi Bi, Kai Ma, Yefeng Zheng, Huchuan Lu, Li Cheng
Complex backgrounds and similar appearances between objects and their surroundings are generally recognized as challenging scenarios in Salient Object Detection (SOD).
Ranked #13 on Thermal Image Segmentation on RGB-T-Glass-Segmentation
1 code implementation • ACL 2021 • Shunyu Yao, Binghui Peng, Christos Papadimitriou, Karthik Narasimhan
Despite their impressive performance in NLP, self-attention networks were recently proved to be limited for processing formal languages with hierarchical structure, such as $\mathsf{Dyck}_k$, the language consisting of well-nested parentheses of $k$ types.
no code implementations • NAACL 2021 • Shunyu Yao, Karthik Narasimhan, Matthew Hausknecht
Text-based games simulate worlds and interact with players using natural language.
1 code implementation • ICCV 2021 • Miao Zhang, Jie Liu, Yifei Wang, Yongri Piao, Shunyu Yao, Wei Ji, Jingjing Li, Huchuan Lu, Zhongxuan Luo
Our bidirectional dynamic fusion strategy encourages the interaction of spatial and temporal information in a dynamic manner.
Ranked #12 on Video Polyp Segmentation on SUN-SEG-Easy (Unseen)
1 code implementation • EMNLP 2020 • Shunyu Yao, Rohan Rao, Matthew Hausknecht, Karthik Narasimhan
In this paper, we propose the Contextual Action Language Model (CALM) to generate a compact set of action candidates at each game state.
1 code implementation • NeurIPS 2019 • Kevin Smith, Lingjie Mei, Shunyu Yao, Jiajun Wu, Elizabeth Spelke, Josh Tenenbaum, Tomer Ullman
We also present a new test set for measuring violations of physical expectations, using a range of scenarios derived from developmental psychology.
1 code implementation • NeurIPS 2018 • Shunyu Yao, Tzu Ming Harry Hsu, Jun-Yan Zhu, Jiajun Wu, Antonio Torralba, William T. Freeman, Joshua B. Tenenbaum
In this work, we propose 3D scene de-rendering networks (3D-SDN) to address the above issues by integrating disentangled representations for semantics, geometry, and appearance into a deep generative model.