1 code implementation • 21 Feb 2024 • Lucas Lehnert, Sainbayar Sukhbaatar, DiJia Su, Qinqing Zheng, Paul McVay, Michael Rabbat, Yuandong Tian
We fine-tune this model to obtain Searchformer, a Transformer model that optimally solves previously unseen Sokoban puzzles 93.7% of the time, while using up to 26.8% fewer search steps than the $A^*$ implementation that was used for training initially.
no code implementations • 5 Feb 2024 • Zihan Ding, Amy Zhang, Yuandong Tian, Qinqing Zheng
We introduce Diffusion World Model (DWM), a conditional diffusion model capable of predicting multistep future states and rewards concurrently.
no code implementations • 22 Nov 2023 • Qinqing Zheng, Matt Le, Neta Shaul, Yaron Lipman, Aditya Grover, Ricky T. Q. Chen
Classifier-free guidance is a key component for enhancing the performance of conditional generative models across diverse tasks.
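Classifier-free guidance is usually applied at sampling time by extrapolating from the model's unconditional prediction toward its conditional one. A minimal sketch of that standard combination rule (the general technique, not this paper's specific contribution):

```python
import numpy as np

def cfg_combine(eps_uncond, eps_cond, w):
    """Classifier-free guidance: move from the unconditional prediction
    toward the conditional one with guidance weight w.
    w = 0 recovers the unconditional model; w = 1 the conditional one;
    w > 1 extrapolates past the conditional prediction."""
    return eps_uncond + w * (eps_cond - eps_uncond)

# Toy check: with w = 1 the guided output equals the conditional prediction.
eps_u = np.array([0.1, -0.2])
eps_c = np.array([0.3, 0.0])
print(cfg_combine(eps_u, eps_c, 1.0))
```

In practice the two predictions come from a single network evaluated with and without the conditioning signal, so guidance costs one extra forward pass per sampling step.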
1 code implementation • 16 Feb 2023 • Harshit Sikchi, Qinqing Zheng, Amy Zhang, Scott Niekum
For offline RL, our analysis frames the recent offline RL method XQL in the dual framework, and we further propose a new method, f-DVL, which offers alternatives to the Gumbel regression loss and thereby fixes the known training instability of XQL.
1 code implementation • 12 Oct 2022 • Qinqing Zheng, Mikael Henaff, Brandon Amos, Aditya Grover
For this setting, we develop and study a simple meta-algorithmic pipeline that learns an inverse dynamics model on the labelled data to obtain proxy-labels for the unlabelled data, followed by the use of any offline RL algorithm on the true and proxy-labelled trajectories.
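The pipeline described above (fit an inverse dynamics model on labelled transitions, then proxy-label the unlabelled ones) can be sketched on toy data. This is a simplified illustration, assuming a hypothetical linear inverse dynamics model rather than the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy labelled transitions: here the (hypothetical) inverse dynamics
# is linear, a = s' - s, purely for illustration.
S = rng.normal(size=(100, 4))
S_next = S + rng.normal(size=(100, 4)) * 0.1
A = S_next - S  # ground-truth actions for the labelled set

# Step 1: fit an inverse dynamics model a ≈ f(s, s') on labelled data
# (a linear least-squares fit stands in for a learned network).
X = np.hstack([S, S_next])
W, *_ = np.linalg.lstsq(X, A, rcond=None)

# Step 2: proxy-label unlabelled transitions with the learned model.
S_u = rng.normal(size=(50, 4))
S_u_next = S_u + rng.normal(size=(50, 4)) * 0.1
A_proxy = np.hstack([S_u, S_u_next]) @ W

# Step 3: hand the union of true- and proxy-labelled trajectories
# to any off-the-shelf offline RL algorithm.
print(A_proxy.shape)
```

The key design point is that the pipeline is agnostic to the downstream offline RL algorithm: only the labelling stage changes.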
1 code implementation • 11 Oct 2022 • Tung Nguyen, Qinqing Zheng, Aditya Grover
We study CWBC in the context of RvS (Emmons et al., 2021) and Decision Transformers (Chen et al., 2021), and show that CWBC significantly boosts their performance on various benchmarks.
1 code implementation • 3 Oct 2022 • Dinghuai Zhang, Aaron Courville, Yoshua Bengio, Qinqing Zheng, Amy Zhang, Ricky T. Q. Chen
While the maximum entropy (MaxEnt) reinforcement learning (RL) framework -- often touted for its exploration and robustness capabilities -- is usually motivated from a probabilistic perspective, the use of deep probabilistic models has not gained much traction in practice due to their inherent complexity.
2 code implementations • 11 Feb 2022 • Qinqing Zheng, Amy Zhang, Aditya Grover
Recent work has shown that offline reinforcement learning (RL) can be formulated as a sequence modeling problem (Chen et al., 2021; Janner et al., 2021) and solved via approaches similar to large-scale language modeling.
no code implementations • 2 Mar 2021 • Shuxiao Chen, Qinqing Zheng, Qi Long, Weijie J. Su
A widely recognized difficulty in federated learning arises from the statistical heterogeneity among clients: local datasets often come from different but not entirely unrelated distributions, and personalization is, therefore, necessary to achieve optimal results from each individual's perspective.
1 code implementation • 22 Feb 2021 • Qinqing Zheng, Shuxiao Chen, Qi Long, Weijie J. Su
Federated learning (FL) is a training paradigm in which clients collaboratively learn models by repeatedly sharing information, without substantially compromising the privacy of their local sensitive data.
1 code implementation • 9 Jun 2020 • Arun Kumar Kuchibhotla, Qinqing Zheng
Many inference problems, such as sequential decision problems like A/B testing and adaptive sampling schemes like bandit selection, are often online in nature.
1 code implementation • ICML 2020 • Qinqing Zheng, Jinshuo Dong, Qi Long, Weijie J. Su
To address this question, we introduce a family of analytical and sharp privacy bounds under composition using the Edgeworth expansion in the framework of the recently proposed f-differential privacy.
no code implementations • 7 Mar 2020 • Qinqing Zheng, Bor-Yiing Su, Jiyan Yang, Alisson Azzolini, Qiang Wu, Ou Jin, Shri Karandikar, Hagay Lupesko, Liang Xiong, Eric Zhou
Recommendation systems are often trained with a tremendous amount of data, and distributed training is the workhorse to shorten the training time.
no code implementations • 23 May 2016 • Qinqing Zheng, John Lafferty
We address the rectangular matrix completion problem by lifting the unknown matrix to a positive semidefinite matrix in higher dimension, and optimizing a nonconvex objective over the semidefinite factor using a simple gradient descent scheme.
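A simplified sketch of this style of approach: run gradient descent directly on the factors of a low-rank parameterization, penalizing the mismatch on observed entries. This toy version omits the paper's lifting construction, initialization, and regularization details:

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2, r = 30, 20, 2

# Ground-truth rank-r matrix and a random 50% observation mask.
M = rng.normal(size=(n1, r)) @ rng.normal(size=(r, n2))
mask = rng.random((n1, n2)) < 0.5

# Gradient descent on the factored objective
#   f(U, V) = 0.5 * || mask * (U V^T - M) ||_F^2
U = rng.normal(size=(n1, r)) * 0.1
V = rng.normal(size=(n2, r)) * 0.1
lr = 0.01
for _ in range(5000):
    R = mask * (U @ V.T - M)                      # residual on observed entries
    U, V = U - lr * (R @ V), V - lr * (R.T @ U)   # simultaneous factor updates

rel_err = np.linalg.norm(U @ V.T - M) / np.linalg.norm(M)
print(rel_err)
```

Despite the nonconvexity, plain gradient descent on the factors recovers the full matrix to small relative error on this toy instance, which is the phenomenon the paper analyzes.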
no code implementations • NeurIPS 2015 • Qinqing Zheng, John Lafferty
We propose a simple, scalable, and fast gradient descent algorithm to optimize a nonconvex objective for the rank minimization problem and a closely related family of semidefinite programs.
1 code implementation • NeurIPS 2015 • Qinqing Zheng, Ryota Tomioka
We consider the problem of recovering a low-rank tensor from its noisy observation.