no code implementations • 15 May 2024 • Sihan Zeng, Thinh T. Doan
Two-time-scale optimization is a framework introduced in Zeng et al. (2024) that abstracts a range of policy evaluation and policy optimization problems in reinforcement learning (RL).
no code implementations • 3 May 2024 • Sihan Zeng, Thinh T. Doan, Justin Romberg
Multi-task reinforcement learning (RL) aims to find a single policy that effectively solves multiple tasks at the same time.
no code implementations • 23 Jan 2024 • Thinh T. Doan
This paper develops a new variant of two-time-scale stochastic approximation to find the roots of two coupled nonlinear operators, assuming that only noisy samples of these operators can be observed.
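The two-time-scale idea can be sketched on a toy coupled root-finding problem. Everything here is illustrative (the operators F and G, the constant c, and the step-size schedules are my own choices, not from the paper): the fast iterate tracks the root of one operator while the slow iterate, seeing the fast one as quasi-stationary, tracks the root of the other.

```python
import random

random.seed(0)

# Toy coupled root-finding problem: find (x, y) with
#   F(x, y) = x - y = 0   (slow variable)
#   G(x, y) = y - c = 0   (fast variable)
# where only noisy samples of c can be observed.  Both roots equal c.
c = 3.0
x, y = 0.0, 0.0

for k in range(1, 200001):
    a = 0.5 / k           # slow step size
    b = 0.5 / k ** 0.6    # fast step size; b/a -> infinity, the two-time-scale condition
    sample = c + random.gauss(0.0, 1.0)   # noisy observation used for G
    y += b * (sample - y)  # fast time scale: y tracks the root of G
    x += a * (y - x)       # slow time scale: x chases y's quasi-stationary estimate

print(x, y)  # both close to 3.0
```

The separation of step sizes is the essential point: because b decays more slowly than a, the fast variable equilibrates between slow-variable moves.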
no code implementations • 15 Jun 2022 • Dingyang Chen, Qi Zhang, Thinh T. Doan
Our focus in this paper is to study the convergence of the policy gradient method for solving MPGs under softmax policy parameterization, both tabular and parameterized with general function approximators such as neural networks.
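A minimal instance of policy gradient under softmax parameterization is the tabular, one-state case (a two-armed bandit) with exact gradients. This is only an illustrative special case, not the Markov potential game setting of the paper; the rewards and step size are my own choices.

```python
import math

# Softmax policy gradient on a one-state MDP (a two-armed bandit),
# with exact gradients.  r[a] is the reward of action a.
r = [1.0, 0.0]
theta = [0.0, 0.0]   # softmax parameters, one per action
eta = 1.0            # step size (illustrative)

for _ in range(2000):
    z = [math.exp(t) for t in theta]
    total = sum(z)
    pi = [p / total for p in z]                 # pi(a) = exp(theta_a) / sum_b exp(theta_b)
    J = sum(p * ra for p, ra in zip(pi, r))     # expected reward under pi
    # Softmax policy gradient: dJ/dtheta_a = pi(a) * (r(a) - J)
    theta = [t + eta * p * (ra - J) for t, p, ra in zip(theta, pi, r)]

z = [math.exp(t) for t in theta]
pi = [p / sum(z) for p in z]
print(pi)  # nearly all probability mass on the better action 0
```

The gradient identity used in the comment follows directly from differentiating the softmax; the same structure underlies the tabular analysis, while the parameterized case replaces the table with a function approximator.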
no code implementations • 27 May 2022 • Sihan Zeng, Thinh T. Doan, Justin Romberg
We study the problem of finding the Nash equilibrium in a two-player zero-sum Markov game.
no code implementations • 17 Dec 2021 • Thinh T. Doan
Perhaps the most popular first-order method for solving min-max optimization problems is the so-called simultaneous (or single-loop) gradient descent-ascent algorithm, owing to its simplicity of implementation.
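The simultaneous update is easy to state in code: both players move at once using gradients evaluated at the current pair. A minimal sketch on a strongly-convex-strongly-concave saddle problem (the objective and step size are illustrative, not from the paper):

```python
# Simultaneous gradient descent-ascent on
#   min_x max_y f(x, y) = 0.5*x**2 + x*y - 0.5*y**2,   saddle point (0, 0).
x, y = 1.0, 1.0
eta = 0.1   # a common step size for both players

for _ in range(500):
    gx = x + y          # df/dx
    gy = x - y          # df/dy
    # Simultaneous (single-loop) update: descent in x, ascent in y,
    # both using gradients at the current iterate (x, y).
    x, y = x - eta * gx, y + eta * gy

print(x, y)  # converges to the saddle point (0, 0)
```

Note that on a purely bilinear objective (drop the quadratic terms) this same simultaneous scheme with a constant step size diverges, which is one reason its analysis is delicate.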
no code implementations • 21 Oct 2021 • Sihan Zeng, Thinh T. Doan, Justin Romberg
To solve this constrained optimization program, we study an online actor-critic variant of a classic primal-dual method where the gradients of both the primal and dual functions are estimated using samples from a single trajectory generated by the underlying time-varying Markov processes.
no code implementations • 29 Sep 2021 • Sihan Zeng, Thinh T. Doan, Justin Romberg
In our two-time-scale approach, one scale is to estimate the true gradient from these samples, which is then used to update the estimate of the optimal solution.
no code implementations • 26 Aug 2021 • Nirupam Gupta, Thinh T. Doan, Nitin Vaidya
However, we do not know of any such techniques for the federated local SGD algorithm, a more commonly used method for federated machine learning.
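For context, federated local SGD alternates local gradient steps at each client with server-side averaging. A minimal sketch on per-client quadratic losses (the losses, client optima, and hyperparameters are illustrative):

```python
# Federated local SGD sketch: each client runs H local SGD steps on its own
# quadratic loss f_i(w) = 0.5*(w - c[i])**2, then the server averages the
# resulting models.
c = [1.0, 2.0, 3.0]   # per-client optima; the global optimum is their mean, 2.0
w = 0.0               # server model
eta, H = 0.1, 5       # local learning rate and number of local steps

for _ in range(100):            # communication rounds
    local_models = []
    for ci in c:                # each client starts from the current server model
        wi = w
        for _ in range(H):      # H local gradient steps on f_i
            wi -= eta * (wi - ci)
        local_models.append(wi)
    w = sum(local_models) / len(local_models)   # server averaging step

print(w)  # converges to 2.0, the minimizer of the average loss
```

On this toy problem local steps introduce no drift because the losses are quadratic in one variable; in general, heterogeneity between clients makes the analysis harder, which is part of the motivation above.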
no code implementations • 28 May 2021 • Marcos M. Vasconcelos, Thinh T. Doan, Urbashi Mitra
In particular, we show that the method converges at a rate $O(\log_2 k/\sqrt{k})$ to the optimal solution when the underlying objective function is strongly convex and smooth.
no code implementations • 4 Apr 2021 • Thinh T. Doan
Such dependent data result in biased observations of the underlying operators.
no code implementations • 26 Jan 2021 • Sajad Khodadadian, Thinh T. Doan, Justin Romberg, Siva Theja Maguluri
In this paper, we characterize the \emph{global} convergence of an online natural actor-critic algorithm in the tabular setting using a single trajectory of samples.
no code implementations • 3 Nov 2020 • Thinh T. Doan
Under some fairly standard assumptions, we provide a formula that characterizes the rate of convergence of the main iterates to the desired solutions.
no code implementations • 28 Oct 2020 • Sihan Zeng, Thinh T. Doan, Justin Romberg
We study a decentralized variant of stochastic approximation, a data-driven approach for finding the root of an operator under noisy measurements.
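A decentralized stochastic approximation step combines local noisy operator evaluations with a consensus (mixing) step over the network. A minimal sketch with three agents and a doubly stochastic weight matrix (the operators, noise level, and weights are illustrative):

```python
import random

random.seed(1)

# Three agents each observe noisy samples of a local operator
# F_i(x) = c[i] - x and mix their iterates with a doubly stochastic
# matrix W.  The network-wide root is the mean of the c[i], here 2.0.
c = [1.0, 2.0, 3.0]
W = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]
x = [0.0, 0.0, 0.0]

for k in range(1, 100001):
    a = 1.0 / k                                               # diminishing step size
    mixed = [sum(W[i][j] * x[j] for j in range(3)) for i in range(3)]
    for i in range(3):
        sample = c[i] + random.gauss(0.0, 0.5) - x[i]         # noisy F_i(x_i)
        x[i] = mixed[i] + a * sample                          # consensus + local SA step

print(x)  # all three iterates close to 2.0
```

The mixing step drives the agents toward consensus while the diminishing step size averages out the measurement noise, so every agent converges to the root of the average operator.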
no code implementations • 24 Jun 2020 • Thinh T. Doan
Motivated by broad applications in reinforcement learning and federated learning, we study local stochastic approximation over a network of agents, where their goal is to find the root of an operator composed of the local operators at the agents.
no code implementations • 24 Mar 2020 • Thinh T. Doan, Lam M. Nguyen, Nhan H. Pham, Justin Romberg
Motivated by broad applications in reinforcement learning and machine learning, this paper considers the popular stochastic gradient descent (SGD) when the gradients of the underlying objective function are sampled from Markov processes.
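The distinguishing feature of this setting is that consecutive gradient samples are correlated because they come from a Markov chain, so each sample is a biased estimate of the true gradient. A minimal sketch with a two-state chain (the chain, loss, and step sizes are illustrative):

```python
import random

random.seed(2)

# SGD where gradient samples are driven by a Markov chain rather than
# i.i.d. noise.  The chain has states 0.0 and 4.0 and flips with
# probability 0.3, so its stationary distribution is uniform and the
# minimizer of E_pi[0.5*(w - s)**2] is w* = 2.0.
w, s = 0.0, 0.0

for k in range(1, 200001):
    if random.random() < 0.3:       # correlated sample path: s is Markovian
        s = 4.0 - s
    w -= (1.0 / k) * (w - s)        # SGD step using the dependent sample

print(w)  # close to the stationary minimizer 2.0
```

Because the chain is ergodic, time averages still converge to stationary expectations; the price of the correlation shows up as an extra mixing-time factor in finite-time bounds, not in the limit point.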
no code implementations • 23 Dec 2019 • Thinh T. Doan
Motivated by broad applications in reinforcement learning, we study linear two-time-scale stochastic approximation, an iterative method that uses two different step sizes to find the solutions of a system of two equations.
no code implementations • 25 Jul 2019 • Thinh T. Doan, Siva Theja Maguluri, Justin Romberg
Our main contribution is a finite-time analysis of the performance of this distributed {\sf TD}$(\lambda)$ algorithm for both constant and time-varying step sizes.
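A minimal sketch of the distributed TD(lambda) idea: agents observe the same Markov chain but different local rewards, run local TD(lambda) updates with eligibility traces, and average their value estimates through consensus, so each agent evaluates the value of the average reward. The chain, rewards, and step sizes below are all illustrative, not from the paper.

```python
import random

random.seed(3)

# Two agents, a two-state chain with uniform transitions, gamma = 0.9.
# Local rewards average to 1 in every state, so the true value of the
# average-reward evaluation problem is 1 / (1 - 0.9) = 10 in both states.
gamma, lam, alpha = 0.9, 0.5, 0.05
rewards = [[2.0, 0.0], [0.0, 2.0]]     # local rewards r_i(s); their mean is 1
V = [[0.0, 0.0], [0.0, 0.0]]           # one value table per agent
z = [0.0, 0.0]                         # eligibility trace (same chain for both)
s = 0

for _ in range(50000):
    s_next = random.randrange(2)       # uniform transitions
    z = [gamma * lam * zi for zi in z]
    z[s] += 1.0                        # accumulate trace for the visited state
    for i in range(2):                 # local TD(lambda) update at each agent
        delta = rewards[i][s] + gamma * V[i][s_next] - V[i][s]
        for j in range(2):
            V[i][j] += alpha * delta * z[j]
    avg = [(V[0][j] + V[1][j]) / 2 for j in range(2)]
    V = [avg[:], avg[:]]               # consensus step: average the estimates
    s = s_next

print(V)  # both agents' value tables close to [10, 10]
```

The consensus step is what lets each agent recover the value of the global (average) reward even though it only ever observes its own local reward.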
1 code implementation • 27 May 2019 • Zaiwei Chen, Sheng Zhang, Thinh T. Doan, John-Paul Clarke, Siva Theja Maguluri
To demonstrate the generality of our theoretical results on Markovian SA, we use them to derive finite-sample bounds for the popular $Q$-learning algorithm with linear function approximation, under a condition on the behavior policy.
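For concreteness, here is Q-learning with linear function approximation on a tiny deterministic MDP, using one-hot features phi(s, a); one-hot features are the simplest linear parameterization, so this reduces to tabular Q-learning. The MDP, the uniformly random behavior policy, and the step size are my own illustrative choices, not the paper's setting.

```python
import random

random.seed(4)

gamma, alpha = 0.9, 0.1
# 2 states x 2 actions, deterministic: next_state[s][a], reward[s][a].
next_state = [[0, 1], [0, 1]]
reward = [[1.0, 0.0], [0.0, 2.0]]
theta = [0.0] * 4                       # theta[2*s + a] = Q(s, a) under one-hot features

s = 0
for _ in range(20000):
    a = random.randrange(2)             # uniformly random (exploratory) behavior policy
    s2, r = next_state[s][a], reward[s][a]
    target = r + gamma * max(theta[2 * s2], theta[2 * s2 + 1])
    # Semi-gradient update: theta += alpha * (target - Q(s,a)) * phi(s,a);
    # with one-hot phi this touches a single component of theta.
    theta[2 * s + a] += alpha * (target - theta[2 * s + a])
    s = s2

print(theta)  # approaches Q* = [17.2, 18.0, 16.2, 20.0]
```

The exploratory behavior policy plays the role of the condition mentioned above: every state-action pair must be visited often enough for the iterates to converge.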
no code implementations • 20 Feb 2019 • Thinh T. Doan, Siva Theja Maguluri, Justin Romberg
In this problem, a group of agents works cooperatively to evaluate the value function for the global discounted cumulative reward, which is composed of the local rewards observed by the agents.
Optimization and Control