Search Results for author: Tianjun Zhang

Found 32 papers, 19 papers with code

LLoCO: Learning Long Contexts Offline

2 code implementations • 11 Apr 2024 • Sijun Tan, Xiuyu Li, Shishir Patil, Ziyang Wu, Tianjun Zhang, Kurt Keutzer, Joseph E. Gonzalez, Raluca Ada Popa

We introduce LLoCO, a technique that combines context compression, retrieval, and parameter-efficient finetuning using LoRA.

4k In-Context Learning +1

205

Paper
Code

GoEX: Perspectives and Designs Towards a Runtime for Autonomous LLM Applications

1 code implementation • 10 Apr 2024 • Shishir G. Patil, Tianjun Zhang, Vivian Fang, Noppapon C., Roy Huang, Aaron Hao, Martin Casado, Joseph E. Gonzalez, Raluca Ada Popa, Ion Stoica

We believe this is critical to unlock the potential for LLM agents to interact with applications and services with limited (post-facto) human involvement.

10,186

Paper
Code

RAFT: Adapting Language Model to Domain Specific RAG

1 code implementation • 15 Mar 2024 • Tianjun Zhang, Shishir G. Patil, Naman jain, Sheng Shen, Matei Zaharia, Ion Stoica, Joseph E. Gonzalez

In this paper, we present Retrieval Augmented FineTuning (RAFT), a training recipe that improves the model's ability to answer questions in a "open-book" in-domain settings.

Language Modelling

10,186

Paper
Code

LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

no code implementations • 12 Mar 2024 • Naman jain, King Han, Alex Gu, Wen-Ding Li, Fanjia Yan, Tianjun Zhang, Sida Wang, Armando Solar-Lezama, Koushik Sen, Ion Stoica

Large Language Models (LLMs) applied to code-related applications have emerged as a prominent field, attracting significant interest from both academia and industry.

Code Generation

Paper
Add Code

In-Context Principle Learning from Mistakes

no code implementations • 8 Feb 2024 • Tianjun Zhang, Aman Madaan, Luyu Gao, Steven Zheng, Swaroop Mishra, Yiming Yang, Niket Tandon, Uri Alon

We evaluate LEAP on a wide range of benchmarks, including multi-hop question answering (Hotpot QA), textual QA (DROP), Big-Bench Hard reasoning, and math problems (GSM8K and MATH); in all these benchmarks, LEAP improves the strongest available LLMs such as GPT-3. 5-turbo, GPT-4, GPT-4 turbo and Claude-2. 1.

GSM8K In-Context Learning +3

Paper
Add Code

Iterative Prompt Relabeling for diffusion model with RLDF

no code implementations • 23 Dec 2023 • Jiaxin Ge, Xinyan Chen, Tianjun Zhang, Shanghang Zhang

IP-RLDF first samples a batch of images conditioned on the text, then relabels the text prompts of unmatched text-image pairs with classifier feedback.

Image Generation reinforcement-learning +2

Paper
Add Code

LLM-Assisted Code Cleaning For Training Accurate Code Generators

no code implementations • 25 Nov 2023 • Naman jain, Tianjun Zhang, Wei-Lin Chiang, Joseph E. Gonzalez, Koushik Sen, Ion Stoica

In this work, we investigate data quality for code and find that making the code more structured and readable leads to improved code generation performance of the system.

Code Generation

Paper
Add Code

AgentBench: Evaluating LLMs as Agents

1 code implementation • 7 Aug 2023 • Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, Hanyu Lai, Yu Gu, Hangliang Ding, Kaiwen Men, Kejuan Yang, Shudan Zhang, Xiang Deng, Aohan Zeng, Zhengxiao Du, Chenhui Zhang, Sheng Shen, Tianjun Zhang, Yu Su, Huan Sun, Minlie Huang, Yuxiao Dong, Jie Tang

We present AgentBench, a multi-dimensional evolving benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities in a multi-turn open-ended generation setting.

Decision Making Instruction Following

1,862

Paper
Code

Controllable Text-to-Image Generation with GPT-4

no code implementations • 29 May 2023 • Tianjun Zhang, Yi Zhang, Vibhav Vineet, Neel Joshi, Xin Wang

Control-GPT works by querying GPT-4 to write TikZ code, and the generated sketches are used as references alongside the text instructions for diffusion models (e. g., ControlNet) to generate photo-realistic images.

Instruction Following Text-to-Image Generation

Paper
Add Code

Gorilla: Large Language Model Connected with Massive APIs

1 code implementation • 24 May 2023 • Shishir G. Patil, Tianjun Zhang, Xin Wang, Joseph E. Gonzalez

Large Language Models (LLMs) have seen an impressive wave of advances recently, with models now excelling in a variety of tasks, such as mathematical reasoning and program synthesis.

Hallucination Language Modelling +4

10,186

Paper
Code

The Wisdom of Hindsight Makes Language Models Better Instruction Followers

1 code implementation • 10 Feb 2023 • Tianjun Zhang, Fangchen Liu, Justin Wong, Pieter Abbeel, Joseph E. Gonzalez

In this paper, we consider an alternative approach: converting feedback to instruction by relabeling the original one and training the model for better alignment in a supervised manner.

Decision Making Language Modelling +2

155

Paper
Code

Latent Variable Representation for Reinforcement Learning

no code implementations • 17 Dec 2022 • Tongzheng Ren, Chenjun Xiao, Tianjun Zhang, Na Li, Zhaoran Wang, Sujay Sanghavi, Dale Schuurmans, Bo Dai

Theoretically, we establish the sample complexity of the proposed approach in the online and offline settings.

Model-based Reinforcement Learning reinforcement-learning +1

Paper
Add Code

Multitask Vision-Language Prompt Tuning

1 code implementation • 21 Nov 2022 • Sheng Shen, Shijia Yang, Tianjun Zhang, Bohan Zhai, Joseph E. Gonzalez, Kurt Keutzer, Trevor Darrell

Specifically, (i) we demonstrate the effectiveness of learning a single transferable prompt from multiple source tasks to initialize the prompt for each target task; (ii) we show many target tasks can benefit each other from sharing prompt vectors and thus can be jointly learned via multitask prompt tuning.

Visual Prompt Tuning

Paper
Code

TEMPERA: Test-Time Prompting via Reinforcement Learning

1 code implementation • 21 Nov 2022 • Tianjun Zhang, Xuezhi Wang, Denny Zhou, Dale Schuurmans, Joseph E. Gonzalez

To achieve this, we design a novel action space that allows flexible editing of the initial prompts covering a wide set of commonly-used components like instructions, few-shot exemplars, and verbalizers.

Few-Shot Learning Natural Language Inference +5

Paper
Code

Efficient Planning in a Compact Latent Action Space

1 code implementation • 22 Aug 2022 • Zhengyao Jiang, Tianjun Zhang, Michael Janner, Yueying Li, Tim Rocktäschel, Edward Grefenstette, Yuandong Tian

Planning-based reinforcement learning has shown strong performance in tasks in discrete and low-dimensional continuous action spaces.

Continuous Control Decision Making +2

Paper
Code

Spectral Decomposition Representation for Reinforcement Learning

no code implementations • 19 Aug 2022 • Tongzheng Ren, Tianjun Zhang, Lisa Lee, Joseph E. Gonzalez, Dale Schuurmans, Bo Dai

Representation learning often plays a critical role in reinforcement learning by managing the curse of dimensionality.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Add Code

Making Linear MDPs Practical via Contrastive Representation Learning

no code implementations • 14 Jul 2022 • Tianjun Zhang, Tongzheng Ren, Mengjiao Yang, Joseph E. Gonzalez, Dale Schuurmans, Bo Dai

It is common to address the curse of dimensionality in Markov decision processes (MDPs) by exploiting low-rank representations.

Representation Learning

Paper
Add Code

Contrastive Learning as Goal-Conditioned Reinforcement Learning

no code implementations • 15 Jun 2022 • Benjamin Eysenbach, Tianjun Zhang, Ruslan Salakhutdinov, Sergey Levine

While deep RL should automatically acquire such good representations, prior work often finds that learning representations in an end-to-end fashion is unstable and instead equip RL algorithms with additional representation learning parts (e. g., auxiliary losses, data augmentation).

Contrastive Learning Data Augmentation +4

Paper
Add Code

Graph Backup: Data Efficient Backup Exploiting Markovian Transitions

1 code implementation • 31 May 2022 • Zhengyao Jiang, Tianjun Zhang, Robert Kirk, Tim Rocktäschel, Edward Grefenstette

In this paper, we treat the transition data of the MDP as a graph, and define a novel backup operator, Graph Backup, which exploits this graph structure for better value estimation.

Atari Games counterfactual +2

Paper
Code

NovelD: A Simple yet Effective Exploration Criterion

1 code implementation • NeurIPS 2021 • Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian

We analyze NovelD thoroughly in MiniGrid and found that empirically it helps the agent explore the environment more uniformly with a focus on exploring beyond the boundary.

Efficient Exploration Montezuma's Revenge +1

Paper
Code

A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning

no code implementations • 22 Nov 2021 • Tongzheng Ren, Tianjun Zhang, Csaba Szepesvári, Bo Dai

Representation learning lies at the heart of the empirical success of deep learning for dealing with the curse of dimensionality.

Reinforcement Learning (RL) Representation Learning

Paper
Add Code

C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks

no code implementations • ICLR 2022 • Tianjun Zhang, Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine, Joseph E. Gonzalez

Goal-conditioned reinforcement learning (RL) can solve tasks in a wide range of domains, including navigation and manipulation, but learning to reach distant goals remains a central challenge to the field.

Reinforcement Learning (RL)

Paper
Add Code

Multi-objective Optimization by Learning Space Partitions

1 code implementation • 7 Oct 2021 • Yiyang Zhao, Linnan Wang, Kevin Yang, Tianjun Zhang, Tian Guo, Yuandong Tian

In this paper, we propose LaMOO, a novel multi-objective optimizer that learns a model from observed samples to partition the search space and then focus on promising regions that are likely to contain a subset of the Pareto frontier.

Neural Architecture Search

Paper
Code

Multi-objective Optimization by Learning Space Partition

no code implementations • ICLR 2022 • Yiyang Zhao, Linnan Wang, Kevin Yang, Tianjun Zhang, Tian Guo, Yuandong Tian

Neural Architecture Search

Paper
Add Code

Learning Space Partitions for Path Planning

2 code implementations • NeurIPS 2021 • Kevin Yang, Tianjun Zhang, Chris Cummins, Brandon Cui, Benoit Steiner, Linnan Wang, Joseph E. Gonzalez, Dan Klein, Yuandong Tian

Path planning, the problem of efficiently discovering high-reward trajectories, often requires optimizing a high-dimensional and multimodal reward function.

Paper
Code

MADE: Exploration via Maximizing Deviation from Explored Regions

1 code implementation • NeurIPS 2021 • Tianjun Zhang, Paria Rashidinejad, Jiantao Jiao, Yuandong Tian, Joseph Gonzalez, Stuart Russell

As a proof of concept, we evaluate the new intrinsic reward on tabular examples across a variety of model-based and model-free algorithms, showing improvements over count-only exploration strategies.

Efficient Exploration Reinforcement Learning (RL)

Paper
Code

BeBold: Exploration Beyond the Boundary of Explored Regions

2 code implementations • 15 Dec 2020 • Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian

In this paper, we analyze the pros and cons of each method and propose the regulated difference of inverse visitation counts as a simple but effective criterion for IR.

Efficient Exploration NetHack

932

Paper
Code

Multi-Agent Collaboration via Reward Attribution Decomposition

2 code implementations • 16 Oct 2020 • Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian

In this work, we propose Collaborative Q-learning (CollaQ) that achieves state-of-the-art performance in the StarCraft multi-agent challenge and supports ad hoc team play.

Dota 2 Multi-agent Reinforcement Learning +2

2,597

Paper
Code

Contrastive Code Representation Learning

1 code implementation • EMNLP 2021 • Paras Jain, Ajay Jain, Tianjun Zhang, Pieter Abbeel, Joseph E. Gonzalez, Ion Stoica

Recent work learns contextual representations of source code by reconstructing tokens from their context.

Ranked #1 on Method name prediction on CodeSearchNet

Clone Detection Contrastive Learning +4

165

Paper
Code

ANODEV2: A Coupled Neural ODE Framework

1 code implementation • NeurIPS 2019 • Tianjun Zhang, Zhewei Yao, Amir Gholami, Joseph E. Gonzalez, Kurt Keutzer, Michael W. Mahoney, George Biros

It has been observed that residual networks can be viewed as the explicit Euler discretization of an Ordinary Differential Equation (ODE).

Paper
Code

ANODEV2: A Coupled Neural ODE Evolution Framework

no code implementations • 10 Jun 2019 • Tianjun Zhang, Zhewei Yao, Amir Gholami, Kurt Keutzer, Joseph Gonzalez, George Biros, Michael Mahoney

It has been observed that residual networks can be viewed as the explicit Euler discretization of an Ordinary Differential Equation (ODE).

Paper
Add Code

Synetgy: Algorithm-hardware Co-design for ConvNet Accelerators on Embedded FPGAs

1 code implementation • 21 Nov 2018 • Yifan Yang, Qijing Huang, Bichen Wu, Tianjun Zhang, Liang Ma, Giulio Gambardella, Michaela Blott, Luciano Lavagno, Kees Vissers, John Wawrzynek, Kurt Keutzer

DiracDeltaNet achieves competitive accuracy on ImageNet (88. 7\% top-5), but with 42$\times$ fewer parameters and 48$\times$ fewer OPs than VGG16.

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.