Dimitri P. Bertsekas

IEEE Trans. Robotics, 2024

Model Predictive Control and Reinforcement Learning: A Unified Framework Based on Dynamic Programming.

CoRR, 2024

An Approximate Dynamic Programming Framework for Occlusion-Robust Multi-Object Tracking.

CoRR, 2024

Most Likely Sequence Generation for n-Grams, Transformers, HMMs, and Markov Chains, by Using Rollout Algorithms.

Yuchao Li

CoRR, 2024

Approximate Multiagent Reinforcement Learning for On-Demand Urban Mobility Problem on a Large Map.

Daniel Garces

Stephanie Gil

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Distributed Online Rollout for Multivehicle Routing in Unmapped Environments.

Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2024

2023

Auction-Based Learning for Question Answering over Knowledge Graphs.

Garima Agrawal

Huan Liu

Inf., 2023

Approximate Multiagent Reinforcement Learning for On-Demand Urban Mobility Problem on a Large Map (extended version).

Daniel Garces

Stephanie Gil

CoRR, 2023

New Auction Algorithms for the Assignment Problem and Extensions.

CoRR, 2023

Multiagent Reinforcement Learning for Autonomous Routing and Pickup Problem with Adaptation to Variable Demand.

Daniel Garces

Stephanie Gil

Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Playing Wordle Using an Online Rollout Algorithm for Deterministic POMDPs.

Siddhant Bhambri

Amrita Bhattacharjee

Proceedings of the IEEE Conference on Games, 2023

2022

ExpertRNA: A New Framework for RNA Secondary Structure Prediction.

INFORMS J. Comput., 2022

Rollout Algorithms and Approximate Dynamic Programming for Bayesian Optimization and Sequential Estimation.

CoRR, 2022

Reinforcement Learning Methods for Wordle: A POMDP/Adaptive Control Approach.

Siddhant Bhambri

Amrita Bhattacharjee

CoRR, 2022

New Auction Algorithms for Path Planning, Network Transport, and Reinforcement Learning.

CoRR, 2022

2021

Multiagent Reinforcement Learning: Rollout and Policy Iteration.

IEEE CAA J. Autom. Sinica, 2021

Distributed Asynchronous Policy Iteration for Sequential Zero-Sum Games and Minimax Control.

CoRR, 2021

On-Line Policy Iteration for Infinite Horizon Dynamic Programming.

CoRR, 2021

Data-driven Rollout for Deterministic Optimal Control.

Yuchao Li

Karl Henrik Johansson

Jonas Mårtensson

Proceedings of the 2021 60th IEEE Conference on Decision and Control (CDC), 2021

2020

Reinforcement Learning for POMDP: Partitioned Rollout and Policy Iteration With Application to Autonomous Sequential Repair Problems.

IEEE Robotics Autom. Lett., 2020

Multiagent Value Iteration Algorithms in Dynamic Programming and Reinforcement Learning.

CoRR, 2020

Constrained Multiagent Rollout and Multidimensional Assignment with the Auction Algorithm.

CoRR, 2020

Multiagent Rollout and Policy Iteration for POMDP with Application to Multi-Robot Repair Problems.

Proceedings of the 4th Conference on Robot Learning, 2020

2019

Affine Monotonic and Risk-Sensitive Models in Dynamic Programming.

IEEE Trans. Autom. Control., 2019

Feature-based aggregation and deep reinforcement learning: a survey and some new implementations.

IEEE CAA J. Autom. Sinica, 2019

Biased Aggregation, Rollout, and Enhanced Policy Improvement for Reinforcement Learning.

CoRR, 2019

Multiagent Rollout Algorithms and Reinforcement Learning.

CoRR, 2019

2018

Proper Policies in Infinite-State Stochastic Shortest Path Problems.

IEEE Trans. Autom. Control., 2018

Stable Optimal Control and Semicontractive Dynamic Programming.

SIAM J. Control. Optim., 2018

Proximal algorithms and temporal difference methods for solving fixed point problems.

Comput. Optim. Appl., 2018

2017

Value and Policy Iterations in Optimal Control and Adaptive Dynamic Programming.

IEEE Trans. Neural Networks Learn. Syst., 2017

Regular Policies in Abstract Dynamic Programming.

SIAM J. Optim., 2017

2016

Stochastic First-Order Methods with Random Constraint Projection.

Mengdi Wang

SIAM J. Optim., 2016

Proximal Algorithms and Temporal Differences for Large Linear Systems: Extrapolation, Approximation, and Simulation.

CoRR, 2016

Robust Shortest Path Planning and Semicontractive Dynamic Programming.

CoRR, 2016

2015

Incremental constraint projection methods for variational inequalities.

Mengdi Wang

Math. Program., 2015

A Mixed Value and Policy Iteration Method for Stochastic Control with Universally Measurable Policies.

Math. Oper. Res., 2015

Incremental Aggregated Proximal and Augmented Lagrangian Algorithms.

CoRR, 2015

Incremental Gradient, Subgradient, and Proximal Methods for Convex Optimization: A Survey.

CoRR, 2015

Lambda-Policy Iteration: A Review and a New Implementation.

CoRR, 2015

Value and Policy Iteration in Optimal Control and Adaptive Dynamic Programming.

CoRR, 2015

Centralized and Distributed Newton Methods for Network Optimization and Extensions.

CoRR, 2015

2014

Stabilization of Stochastic Iterative Methods for Singular and Nearly Singular Linear Systems.

Mengdi Wang

Math. Oper. Res., 2014

2013

On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems.

Math. Oper. Res., 2013

Q-learning and policy iteration algorithms for stochastic shortest path problems.

Ann. Oper. Res., 2013

2012

Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming.

Math. Oper. Res., 2012

2011

Temporal Difference Methods for General Projected Equations.

IEEE Trans. Autom. Control., 2011

A Unifying Polyhedral Approximation Framework for Convex Optimization.

SIAM J. Optim., 2011

Preface.

Zhi-Quan Luo

Math. Program., 2011

Incremental proximal methods for large scale convex optimization.

Math. Program., 2011

2010

The effect of deterministic noise in subgradient methods.

Angelia Nedic

Math. Program., 2010

Error Bounds for Approximations from Projected Linear Equations.

Math. Oper. Res., 2010

Pathologies of temporal difference methods in approximate dynamic programming.

Proceedings of the 49th IEEE Conference on Decision and Control, 2010

Distributed asynchronous policy iteration in dynamic programming.

Proceedings of the 48th Annual Allerton Conference on Communication, Control, and Computing, 2010

2009

Neuro-Dynamic Programming.

Encyclopedia of Optimization, Second Edition, 2009

Auction Algorithms.

Encyclopedia of Optimization, Second Edition, 2009

Convergence Results for Some Temporal Difference Methods Based on Least Squares.

IEEE Trans. Autom. Control., 2009

Basis function adaptation methods for cost approximation in MDP.

Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2009

A unified framework for temporal difference methods.

Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2009

2008

On Near Optimality of the Set of Finite-State Controllers for Average Cost POMDP.

Math. Oper. Res., 2008

New Error Bounds for Approximations from Projected Linear Equations.

Proceedings of the Recent Advances in Reinforcement Learning, 8th European Workshop, 2008

2007

Erratum to "Comments on 'Coordination of Groups of Mobile Autonomous Agents Using Nearest Neighbor Rules'".

IEEE Trans. Autom. Control., 2007

Comments on "Coordination of Groups of Mobile Autonomous Agents Using Nearest Neighbor Rules".

IEEE Trans. Autom. Control., 2007

Separable Dynamic Programming and Approximate Decomposition Methods.

IEEE Trans. Autom. Control., 2007

Set Intersection Theorems and Existence of Optimal Solutions.

Math. Program., 2007

2006

Enhanced Fritz John Conditions for Convex Programming.

Asuman E. Ozdaglar

SIAM J. Optim., 2006

Neuro-Dynamic Programming: An Overview and Recent Results.

Proceedings of Operations Research, 2006

2005

Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC.

Eur. J. Control, 2005

Dynamic Programming and Suboptimal Control: From ADP to MPC.

Proceedings of the 44th IEEE Conference on Decision and Control and 8th European Control Conference, 2005

Dynamic programming and optimal control, 3rd Edition.

Athena Scientific, ISBN: 1886529264, 2005

2004

The relation between pseudonormality and quasiregularity in constrained optimization.

Asuman E. Ozdaglar

Optim. Methods Softw., 2004

Discretized Approximations for POMDP with Average Cost.

Proceedings of the UAI '04, 2004

2003

Routing and wavelength assignment in optical networks.

Asuman E. Ozdaglar

IEEE/ACM Trans. Netw., 2003

Least Squares Policy Evaluation Algorithms with Linear Function Approximation.

Angelia Nedic

Discret. Event Dyn. Syst., 2003

2002

Stochastic Approximation for Nonexpansive Maps: Application to Q-Learning Algorithms.

Jinane Abounadi

Vivek S. Borkar

SIAM J. Control. Optim., 2002

2001

Distributed power control algorithms for wireless networks.

Cynara Wu

IEEE Trans. Veh. Technol., 2001

Incremental Subgradient Methods for Nondifferentiable Optimization.

Angelia Nedic

SIAM J. Optim., 2001

Learning Algorithms for Markov Decision Processes with Average Cost.

Jinane Abounadi

Vivek S. Borkar

SIAM J. Control. Optim., 2001

Reservation-Based Session Routing for Broadband Communication Networks with Strict QoS Requirements.

Chi-Hsiang Yeh

Hussein T. Mouftah

Proceedings of the 15th International Conference on Information Networking, 2001

2000

Missile defense and interceptor allocation by neuro-dynamic programming.

IEEE Trans. Syst. Man Cybern. Part A, 2000

Gradient Convergence in Gradient Methods with Errors.

SIAM J. Optim., 2000

An ε-relaxation method for separable convex cost generalized network flow problems.

Math. Program., 2000

1999

Rollout Algorithms for Stochastic Scheduling Problems.

J. Heuristics, 1999

A Note on Error Bounds for Convex and Nonconvex Programs.

Comput. Optim. Appl., 1999

1998

Implementation of efficient algorithms for globally optimal trajectories.

Lazaros C. Polymenakos

IEEE Trans. Autom. Control., 1998

1997

An ε-Relaxation Method for Separable Convex Cost Network Flow Problems.

Lazaros C. Polymenakos

SIAM J. Optim., 1997

A New Class of Incremental Gradient Methods for Least Squares Problems.

SIAM J. Optim., 1997

Rollout Algorithms for Combinatorial Optimization.

Cynara Wu

J. Heuristics, 1997

1996

A Conflict Sense Routing Protocol and Its Performance for Hypercubes.

IEEE Trans. Computers, 1996

Incremental Least Squares Methods and the Extended Kalman Filter.

SIAM J. Optim., 1996

Finite Termination of Asynchronous Iterative Algorithms.

Serap A. Savari

Parallel Comput., 1996

Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems.

Satinder Singh

Proceedings of the Advances in Neural Information Processing Systems 9, 1996

An ε-Relaxation Method for Generalized Separable Convex Cost Network Flow Problems.

Proceedings of the Integer Programming and Combinatorial Optimization, 1996

Neuro-dynamic programming.

Optimization and neural computation series 3, Athena Scientific, ISBN: 1886529108, 1996

1995

Dynamic Broadcasting in Parallel Computing.

IEEE Trans. Parallel Distributed Syst., 1995

Transposition of Banded Matrices in Hypercubes: A Nearly Isotropic Task.

Parallel Comput., 1995

Generic rank-one corrections for value iteration in Markovian decision problems.

Oper. Res. Lett., 1995

A Counterexample to Temporal Differences Learning.

Neural Comput., 1995

Polynomial auction algorithms for shortest paths.

Stefano Pallottino

Maria Grazia Scutellà

Comput. Optim. Appl., 1995

1994

Performance of hypercube routing schemes with or without buffering.

IEEE/ACM Trans. Netw., 1994

Partial Proximal Minimization Algorithms for Convex Programming.

SIAM J. Optim., 1994

Parallel Shortest Path Auction Algorithms.

Lazaros Polymenakos

Parallel Comput., 1994

Partial Multinode Broadcast and Partial Exchange Algorithms for d-Dimensional Meshes.

J. Parallel Distributed Comput., 1994

1993

Multinode Broadcast in Hypercubes and Rings with Randomly Distributed Length of Packets.

IEEE Trans. Parallel Distributed Syst., 1993

Reverse Auction and the Solution of Inequality Constrained Assignment Problems.

Haralampos Tsaknakis

SIAM J. Optim., 1993

A simple and fast label correcting algorithm for shortest paths.

Networks, 1993

On the convergence of the exponential multiplier method for convex programming.

Math. Program., 1993

Parallel Asynchronous Hungarian Methods for the Assignment Problem.

INFORMS J. Comput., 1993

Parallel primal-dual methods for the minimum cost flow problem.

Comput. Optim. Appl., 1993

A generic auction algorithm for the minimum cost network flow problem.

Comput. Optim. Appl., 1993

1992

Communication algorithms for isotropic tasks in hypercubes and wraparound meshes.

Parallel Comput., 1992

On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators.

Jonathan Eckstein

Math. Program., 1992

A forward/reverse auction algorithm for asymmetric assignment problems.

Comput. Optim. Appl., 1992

Auction algorithms for network flow problems: A tutorial introduction.

Comput. Optim. Appl., 1992

Partial Multinode Broadcast Algorithms for D-Dimensional Meshes.

Proceedings of the 1992 International Conference on Parallel Processing, 1992

Data Networks, Second Edition.

Robert G. Gallager

Prentice Hall, ISBN: 978-0-13-201674-2, 1992

1991

An Auction Algorithm for Shortest Paths.

SIAM J. Optim., 1991

Parallel synchronous and asynchronous implementations of the auction algorithm.

Parallel Comput., 1991

Relaxation Methods for Problems with Strictly Convex Costs and Linear Constraints.

Math. Oper. Res., 1991

An Analysis of Stochastic Shortest Path Problems.

Math. Oper. Res., 1991

Optimal Communication Algorithms for Hypercubes.

J. Parallel Distributed Comput., 1991

Some aspects of parallel and distributed iterative algorithms - A survey.

Autom., 1991

Linear network optimization - algorithms and codes.

MIT Press, ISBN: 978-0-262-02334-4, 1991

1990

Relaxation Methods for Monotropic Programs.

Math. Program., 1990

1989

Convergence rate and termination of asynchronous iterative algorithms.

Proceedings of the 3rd International Conference on Supercomputing, 1989

Parallel and distributed computation.

Prentice Hall, ISBN: 978-0-13-648759-3, 1989

1988

Dual coordinate step methods for linear network flow problems.

Jonathan Eckstein

Math. Program., 1988

Relaxation Methods for Minimum Cost Ordinary and Generalized Network Flow Problems.

Oper. Res., 1988

1987

Asymptotic optimality of shortest path routing algorithms.

Eli Gafni

IEEE Trans. Inf. Theory, 1987

Relaxation methods for problems with strictly convex separable costs and linear constraints.

Math. Program., 1987

Relaxation Methods for Linear Programs.

Math. Oper. Res., 1987

1985

A unified framework for primal-dual methods in minimum cost network flow problems.

Math. Program., 1985

1984

Second Derivative Algorithms for Minimum Delay Distributed Routing in Networks.

Eli Gafni

Robert G. Gallager

IEEE Trans. Commun., 1984

1983

Distributed asynchronous computation of fixed points.

Math. Program., 1983

Path assignment for virtual circuit routing.

Eliezer M. Gafni

Proceedings of the Symposium on Communications Architectures & Protocols, 1983

1981

Distributed Algorithms for Generating Loop-Free Routes in Networks with Frequently Changing Topology.

Eli Gafni

IEEE Trans. Commun., 1981

A new algorithm for the assignment problem.

Math. Program., 1981

1979

Universally Measurable Policies in Dynamic Programming.

Steven E. Shreve

Math. Oper. Res., 1979

1976

Multiplier methods: A survey.

Autom., 1976

1975

Necessary and sufficient conditions for a penalty method to be exact.

Math. Program., 1975

1971

Control of uncertain systems with a set-membership description of the uncertainty.
