Search Results for author: Tianjun Zhang

Found 32 papers, 19 papers with code

LLoCO: Learning Long Contexts Offline

2 code implementations · 11 Apr 2024 · Sijun Tan, Xiuyu Li, Shishir Patil, Ziyang Wu, Tianjun Zhang, Kurt Keutzer, Joseph E. Gonzalez, Raluca Ada Popa

We introduce LLoCO, a technique that combines context compression, retrieval, and parameter-efficient finetuning using LoRA.

4k In-Context Learning +1

GoEX: Perspectives and Designs Towards a Runtime for Autonomous LLM Applications

1 code implementation · 10 Apr 2024 · Shishir G. Patil, Tianjun Zhang, Vivian Fang, Noppapon C., Roy Huang, Aaron Hao, Martin Casado, Joseph E. Gonzalez, Raluca Ada Popa, Ion Stoica

We believe this is critical to unlock the potential for LLM agents to interact with applications and services with limited (post-facto) human involvement.

RAFT: Adapting Language Model to Domain Specific RAG

1 code implementation · 15 Mar 2024 · Tianjun Zhang, Shishir G. Patil, Naman Jain, Sheng Shen, Matei Zaharia, Ion Stoica, Joseph E. Gonzalez

In this paper, we present Retrieval Augmented FineTuning (RAFT), a training recipe that improves the model's ability to answer questions in an "open-book", in-domain setting.

Language Modelling

LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

no code implementations · 12 Mar 2024 · Naman Jain, King Han, Alex Gu, Wen-Ding Li, Fanjia Yan, Tianjun Zhang, Sida Wang, Armando Solar-Lezama, Koushik Sen, Ion Stoica

Large Language Models (LLMs) applied to code-related applications have emerged as a prominent field, attracting significant interest from both academia and industry.

Code Generation

In-Context Principle Learning from Mistakes

no code implementations · 8 Feb 2024 · Tianjun Zhang, Aman Madaan, Luyu Gao, Steven Zheng, Swaroop Mishra, Yiming Yang, Niket Tandon, Uri Alon

We evaluate LEAP on a wide range of benchmarks, including multi-hop question answering (HotpotQA), textual QA (DROP), Big-Bench Hard reasoning, and math problems (GSM8K and MATH); on all these benchmarks, LEAP improves the strongest available LLMs, such as GPT-3.5-turbo, GPT-4, GPT-4-turbo, and Claude-2.1.

GSM8K In-Context Learning +3

Iterative Prompt Relabeling for diffusion model with RLDF

no code implementations · 23 Dec 2023 · Jiaxin Ge, Xinyan Chen, Tianjun Zhang, Shanghang Zhang

IP-RLDF first samples a batch of images conditioned on the text, then relabels the text prompts of unmatched text-image pairs with classifier feedback.

Image Generation reinforcement-learning +2
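The sample-then-relabel loop described in the snippet above can be sketched in a few lines. This is only an illustration of the idea: the generator, matcher, and relabeler are stand-in callables, and none of the names here come from the paper's actual code.

```python
def iterative_prompt_relabeling(prompts, generate, matches, relabel, rounds=2):
    """Repeatedly sample images for each prompt, keep matched text-image
    pairs unchanged, and rewrite the prompts of unmatched pairs using
    classifier feedback before the next round."""
    for _ in range(rounds):
        images = [generate(p) for p in prompts]
        prompts = [p if matches(p, img) else relabel(p, img)
                   for p, img in zip(prompts, images)]
    return list(zip(prompts, images))

# Toy instances: the "generator" always produces a cat image, the
# "classifier" checks exact agreement, and relabeling replaces the prompt
# with the classifier's description of the image.
out = iterative_prompt_relabeling(
    ["a dog", "a cat"],
    generate=lambda p: "a cat",
    matches=lambda p, img: p == img,
    relabel=lambda p, img: img,
)
```

After relabeling, every retained text-image pair is self-consistent, which is what makes the pairs usable as finetuning data in later rounds.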

LLM-Assisted Code Cleaning For Training Accurate Code Generators

no code implementations · 25 Nov 2023 · Naman Jain, Tianjun Zhang, Wei-Lin Chiang, Joseph E. Gonzalez, Koushik Sen, Ion Stoica

In this work, we investigate data quality for code and find that making the code more structured and readable leads to improved code generation performance of the system.

Code Generation

AgentBench: Evaluating LLMs as Agents

1 code implementation · 7 Aug 2023 · Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, Hanyu Lai, Yu Gu, Hangliang Ding, Kaiwen Men, Kejuan Yang, Shudan Zhang, Xiang Deng, Aohan Zeng, Zhengxiao Du, Chenhui Zhang, Sheng Shen, Tianjun Zhang, Yu Su, Huan Sun, Minlie Huang, Yuxiao Dong, Jie Tang

We present AgentBench, a multi-dimensional evolving benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities in a multi-turn open-ended generation setting.

Decision Making Instruction Following

Controllable Text-to-Image Generation with GPT-4

no code implementations · 29 May 2023 · Tianjun Zhang, Yi Zhang, Vibhav Vineet, Neel Joshi, Xin Wang

Control-GPT works by querying GPT-4 to write TikZ code, and the generated sketches are used as references alongside the text instructions for diffusion models (e.g., ControlNet) to generate photo-realistic images.

Instruction Following Text-to-Image Generation

Gorilla: Large Language Model Connected with Massive APIs

1 code implementation · 24 May 2023 · Shishir G. Patil, Tianjun Zhang, Xin Wang, Joseph E. Gonzalez

Large Language Models (LLMs) have seen an impressive wave of advances recently, with models now excelling in a variety of tasks, such as mathematical reasoning and program synthesis.

Hallucination Language Modelling +4

The Wisdom of Hindsight Makes Language Models Better Instruction Followers

1 code implementation · 10 Feb 2023 · Tianjun Zhang, Fangchen Liu, Justin Wong, Pieter Abbeel, Joseph E. Gonzalez

In this paper, we consider an alternative approach: converting feedback to instruction by relabeling the original one and training the model for better alignment in a supervised manner.

Decision Making Language Modelling +2
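The "converting feedback to instruction by relabeling" step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the instruction template and the toy relabeling rule are assumptions made for the example.

```python
def hindsight_relabel(instruction: str, output: str, achieved_goal: str):
    """Hindsight relabeling: rather than penalizing an output that missed
    the original goal, rewrite the instruction so that the output becomes
    a correct response, yielding a (new_instruction, output) pair for
    supervised finetuning."""
    prefix = instruction.split(":")[0]
    return f"{prefix}: {achieved_goal}", output

# The model was asked for "apple" but wrote a sentence about the sky;
# relabel the instruction to match what the output actually achieved.
new_instruction, target = hindsight_relabel(
    "Write a sentence containing the word: apple",
    "The sky is blue today.",
    achieved_goal="sky",
)
```

The appeal of this scheme is that every sampled output, including failures, becomes usable supervised data, sidestepping reward modeling entirely.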

Multitask Vision-Language Prompt Tuning

1 code implementation · 21 Nov 2022 · Sheng Shen, Shijia Yang, Tianjun Zhang, Bohan Zhai, Joseph E. Gonzalez, Kurt Keutzer, Trevor Darrell

Specifically, (i) we demonstrate the effectiveness of learning a single transferable prompt from multiple source tasks to initialize the prompt for each target task; and (ii) we show that many target tasks can benefit from sharing prompt vectors with one another and thus can be jointly learned via multitask prompt tuning.

Visual Prompt Tuning

TEMPERA: Test-Time Prompting via Reinforcement Learning

1 code implementation · 21 Nov 2022 · Tianjun Zhang, Xuezhi Wang, Denny Zhou, Dale Schuurmans, Joseph E. Gonzalez

To achieve this, we design a novel action space that allows flexible editing of the initial prompts covering a wide set of commonly-used components like instructions, few-shot exemplars, and verbalizers.

Few-Shot Learning Natural Language Inference +5
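The action space described above, discrete edits over the components of a prompt, can be sketched with a small data structure. The field names and the two sample edit operations below are illustrative; the paper's actual action set is richer.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Prompt:
    instruction: str
    exemplars: tuple   # few-shot exemplars, in order
    verbalizer: str    # label words mapping model outputs to classes

def swap_exemplars(p: Prompt, i: int, j: int) -> Prompt:
    """One edit action: permute the few-shot exemplars."""
    ex = list(p.exemplars)
    ex[i], ex[j] = ex[j], ex[i]
    return replace(p, exemplars=tuple(ex))

def set_verbalizer(p: Prompt, v: str) -> Prompt:
    """Another edit action: substitute the verbalizer."""
    return replace(p, verbalizer=v)

# At test time, an RL policy would choose among such edits per query,
# scoring each candidate prompt with the downstream model.
p0 = Prompt("Classify the review.", ("ex1", "ex2"), "good/bad")
p1 = set_verbalizer(swap_exemplars(p0, 0, 1), "great/terrible")
```

Making `Prompt` immutable keeps each edit a pure function, so candidate prompts explored by the policy never interfere with one another.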

Efficient Planning in a Compact Latent Action Space

1 code implementation · 22 Aug 2022 · Zhengyao Jiang, Tianjun Zhang, Michael Janner, Yueying Li, Tim Rocktäschel, Edward Grefenstette, Yuandong Tian

Planning-based reinforcement learning has shown strong performance in tasks in discrete and low-dimensional continuous action spaces.

Continuous Control Decision Making +2

Making Linear MDPs Practical via Contrastive Representation Learning

no code implementations · 14 Jul 2022 · Tianjun Zhang, Tongzheng Ren, Mengjiao Yang, Joseph E. Gonzalez, Dale Schuurmans, Bo Dai

It is common to address the curse of dimensionality in Markov decision processes (MDPs) by exploiting low-rank representations.

Representation Learning

Contrastive Learning as Goal-Conditioned Reinforcement Learning

no code implementations · 15 Jun 2022 · Benjamin Eysenbach, Tianjun Zhang, Ruslan Salakhutdinov, Sergey Levine

While deep RL should automatically acquire such good representations, prior work often finds that learning representations in an end-to-end fashion is unstable and instead equip RL algorithms with additional representation learning parts (e.g., auxiliary losses, data augmentation).

Contrastive Learning Data Augmentation +4

Graph Backup: Data Efficient Backup Exploiting Markovian Transitions

1 code implementation · 31 May 2022 · Zhengyao Jiang, Tianjun Zhang, Robert Kirk, Tim Rocktäschel, Edward Grefenstette

In this paper, we treat the transition data of the MDP as a graph, and define a novel backup operator, Graph Backup, which exploits this graph structure for better value estimation.

Atari Games counterfactual +2
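The graph view of transition data described above can be sketched in tabular form. This is a minimal one-step value backup over observed transitions and omits the paper's actual operator details (e.g., n-step mixing); all names are illustrative.

```python
from collections import defaultdict

def graph_value_backup(transitions, gamma=0.9, sweeps=50):
    """Build a graph from observed (s, r, s') tuples and repeatedly back
    up values along its edges. Transitions that share a successor state
    reuse its value, so one observed reward can inform many states,
    unlike a purely trajectory-wise backup."""
    graph = defaultdict(list)            # s -> [(r, s'), ...]
    for s, r, s_next in transitions:
        graph[s].append((r, s_next))
    V = defaultdict(float)               # unseen states default to 0
    for _ in range(sweeps):
        for s, edges in graph.items():
            # Back up the best observed edge (greedy over the graph).
            V[s] = max(r + gamma * V[s_next] for r, s_next in edges)
    return dict(V)

# Two trajectories merge at state "c": the reward observed after "c"
# propagates to both "a" and "b" through the shared graph node.
data = [("a", 0.0, "c"), ("b", 0.0, "c"), ("c", 1.0, "terminal")]
values = graph_value_backup(data)
```

The merge at `"c"` is the point: a trajectory-wise backup starting from `"b"` would never see the reward collected on the `"a"` trajectory, while the graph backup shares it.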

NovelD: A Simple yet Effective Exploration Criterion

1 code implementation · NeurIPS 2021 · Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian

We analyze NovelD thoroughly in MiniGrid and find that, empirically, it helps the agent explore the environment more uniformly, with a focus on exploring beyond the boundary.

Efficient Exploration Montezuma's Revenge +1

A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning

no code implementations · 22 Nov 2021 · Tongzheng Ren, Tianjun Zhang, Csaba Szepesvári, Bo Dai

Representation learning lies at the heart of the empirical success of deep learning for dealing with the curse of dimensionality.

Reinforcement Learning (RL) Representation Learning

C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks

no code implementations · ICLR 2022 · Tianjun Zhang, Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine, Joseph E. Gonzalez

Goal-conditioned reinforcement learning (RL) can solve tasks in a wide range of domains, including navigation and manipulation, but learning to reach distant goals remains a central challenge to the field.

Reinforcement Learning (RL)

Multi-objective Optimization by Learning Space Partitions

1 code implementation · 7 Oct 2021 · Yiyang Zhao, Linnan Wang, Kevin Yang, Tianjun Zhang, Tian Guo, Yuandong Tian

In this paper, we propose LaMOO, a novel multi-objective optimizer that learns a model from observed samples to partition the search space and then focus on promising regions that are likely to contain a subset of the Pareto frontier.

Neural Architecture Search

Multi-objective Optimization by Learning Space Partition

no code implementations · ICLR 2022 · Yiyang Zhao, Linnan Wang, Kevin Yang, Tianjun Zhang, Tian Guo, Yuandong Tian

In this paper, we propose LaMOO, a novel multi-objective optimizer that learns a model from observed samples to partition the search space and then focus on promising regions that are likely to contain a subset of the Pareto frontier.

Neural Architecture Search
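The partitioning idea in the two LaMOO entries above rests on Pareto dominance. Below is a minimal sketch, for two minimization objectives, of splitting observed samples into a promising region (the empirical Pareto front) and the rest; LaMOO's actual learned, classifier-based partitioning is considerably more involved, and these helpers are only illustrative.

```python
def dominates(a, b):
    """a dominates b (minimization): a is no worse in every objective
    and strictly better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Samples not dominated by any other observed sample."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def partition(points):
    """Toy space partition: keep sampling near the empirical Pareto
    front; the dominated region is deprioritized."""
    front = pareto_front(points)
    return front, [p for p in points if p not in front]

# (1, 2) and (2, 1) trade off against each other; (3, 3) is dominated.
samples = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
promising, dominated = partition(samples)
```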

Learning Space Partitions for Path Planning

2 code implementations · NeurIPS 2021 · Kevin Yang, Tianjun Zhang, Chris Cummins, Brandon Cui, Benoit Steiner, Linnan Wang, Joseph E. Gonzalez, Dan Klein, Yuandong Tian

Path planning, the problem of efficiently discovering high-reward trajectories, often requires optimizing a high-dimensional and multimodal reward function.

MADE: Exploration via Maximizing Deviation from Explored Regions

1 code implementation · NeurIPS 2021 · Tianjun Zhang, Paria Rashidinejad, Jiantao Jiao, Yuandong Tian, Joseph Gonzalez, Stuart Russell

As a proof of concept, we evaluate the new intrinsic reward on tabular examples across a variety of model-based and model-free algorithms, showing improvements over count-only exploration strategies.

Efficient Exploration Reinforcement Learning (RL)

BeBold: Exploration Beyond the Boundary of Explored Regions

2 code implementations · 15 Dec 2020 · Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian

In this paper, we analyze the pros and cons of each method and propose the regulated difference of inverse visitation counts as a simple but effective criterion for IR.

Efficient Exploration NetHack
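The criterion named in the BeBold snippet above, the regulated difference of inverse visitation counts, can be sketched in tabular form. The clipping at zero follows the "regulated" reading of the criterion; the count bookkeeping and all names here are illustrative rather than the paper's implementation.

```python
from collections import defaultdict

def bebold_ir(n_s: int, n_s_next: int) -> float:
    """Intrinsic reward for a transition s -> s':
    max(1/N(s') - 1/N(s), 0). Clipping at zero means the agent is only
    rewarded for moving from better-explored states into less-explored
    ones, i.e. for stepping beyond the boundary of explored regions."""
    return max(1.0 / n_s_next - 1.0 / n_s, 0.0)

# Toy rollout: count visits and accumulate intrinsic reward per step.
visits = defaultdict(int)
trajectory = ["start", "hall", "hall", "door", "secret_room"]
prev = trajectory[0]
visits[prev] += 1
total_ir = 0.0
for state in trajectory[1:]:
    visits[state] += 1
    total_ir += bebold_ir(visits[prev], visits[state])
    prev = state
```

Note how only the `hall -> door` step earns reward in this rollout: revisiting `hall` yields nothing, which is the intended asymmetry relative to count-only bonuses.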

Multi-Agent Collaboration via Reward Attribution Decomposition

2 code implementations · 16 Oct 2020 · Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian

In this work, we propose Collaborative Q-learning (CollaQ) that achieves state-of-the-art performance in the StarCraft multi-agent challenge and supports ad hoc team play.

Dota 2 Multi-agent Reinforcement Learning +2

ANODEV2: A Coupled Neural ODE Framework

1 code implementation · NeurIPS 2019 · Tianjun Zhang, Zhewei Yao, Amir Gholami, Joseph E. Gonzalez, Kurt Keutzer, Michael W. Mahoney, George Biros

It has been observed that residual networks can be viewed as the explicit Euler discretization of an Ordinary Differential Equation (ODE).

ANODEV2: A Coupled Neural ODE Evolution Framework

no code implementations · 10 Jun 2019 · Tianjun Zhang, Zhewei Yao, Amir Gholami, Kurt Keutzer, Joseph Gonzalez, George Biros, Michael Mahoney

It has been observed that residual networks can be viewed as the explicit Euler discretization of an Ordinary Differential Equation (ODE).

Synetgy: Algorithm-hardware Co-design for ConvNet Accelerators on Embedded FPGAs

1 code implementation · 21 Nov 2018 · Yifan Yang, Qijing Huang, Bichen Wu, Tianjun Zhang, Liang Ma, Giulio Gambardella, Michaela Blott, Luciano Lavagno, Kees Vissers, John Wawrzynek, Kurt Keutzer

DiracDeltaNet achieves competitive accuracy on ImageNet (88.7% top-5), but with 42× fewer parameters and 48× fewer OPs than VGG16.
