Dimitri P. Bertsekas

According to our database1, Dimitri P. Bertsekas authored at least 118 papers between 1975 and 2021.

Collaborative distances:


IEEE Fellow

IEEE Fellow 1984, "For contributions to optimization, data communication networks, and distributed control".



In proceedings 
PhD thesis 


Online presence:

On csauthors.net:


Multiagent Reinforcement Learning: Rollout and Policy Iteration.
IEEE CAA J. Autom. Sinica, 2021

Reinforcement Learning for POMDP: Partitioned Rollout and Policy Iteration With Application to Autonomous Sequential Repair Problems.
IEEE Robotics Autom. Lett., 2020

Multiagent Rollout and Policy Iteration for POMDP with Application to Multi-Robot Repair Problems.
CoRR, 2020

Multiagent Value Iteration Algorithms in Dynamic Programming and Reinforcement Learning.
CoRR, 2020

Constrained Multiagent Rollout and Multidimensional Assignment with the Auction Algorithm.
CoRR, 2020

Affine Monotonic and Risk-Sensitive Models in Dynamic Programming.
IEEE Trans. Autom. Control., 2019

Feature-based aggregation and deep reinforcement learning: a survey and some new implementations.
IEEE CAA J. Autom. Sinica, 2019

Biased Aggregation, Rollout, and Enhanced Policy Improvement for Reinforcement Learning.
CoRR, 2019

Multiagent Rollout Algorithms and Reinforcement Learning.
CoRR, 2019

Proper Policies in Infinite-State Stochastic Shortest Path Problems.
IEEE Trans. Autom. Control., 2018

Stable Optimal Control and Semicontractive Dynamic Programming.
SIAM J. Control. Optim., 2018

Proximal algorithms and temporal difference methods for solving fixed point problems.
Comput. Optim. Appl., 2018

Value and Policy Iterations in Optimal Control and Adaptive Dynamic Programming.
IEEE Trans. Neural Networks Learn. Syst., 2017

Regular Policies in Abstract Dynamic Programming.
SIAM J. Optim., 2017

Stochastic First-Order Methods with Random Constraint Projection.
SIAM J. Optim., 2016

Proximal Algorithms and Temporal Differences for Large Linear Systems: Extrapolation, Approximation, and Simulation.
CoRR, 2016

Robust Shortest Path Planning and Semicontractive Dynamic Programming.
CoRR, 2016

Incremental constraint projection methods for variational inequalities.
Math. Program., 2015

A Mixed Value and Policy Iteration Method for Stochastic Control with Universally Measurable Policies.
Math. Oper. Res., 2015

Incremental Aggregated Proximal and Augmented Lagrangian Algorithms.
CoRR, 2015

Incremental Gradient, Subgradient, and Proximal Methods for Convex Optimization: A Survey.
CoRR, 2015

Lambda-Policy Iteration: A Review and a New Implementation.
CoRR, 2015

Value and Policy Iteration in Optimal Control and Adaptive Dynamic Programming.
CoRR, 2015

Centralized and Distributed Newton Methods for Network Optimization and Extensions.
CoRR, 2015

Stabilization of Stochastic Iterative Methods for Singular and Nearly Singular Linear Systems.
Math. Oper. Res., 2014

On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems.
Math. Oper. Res., 2013

Q-learning and policy iteration algorithms for stochastic shortest path problems.
Ann. Oper. Res., 2013

Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming.
Math. Oper. Res., 2012

Temporal Difference Methods for General Projected Equations.
IEEE Trans. Autom. Control., 2011

A Unifying Polyhedral Approximation Framework for Convex Optimization.
SIAM J. Optim., 2011

Math. Program., 2011

Incremental proximal methods for large scale convex optimization.
Math. Program., 2011

The effect of deterministic noise in subgradient methods.
Math. Program., 2010

Error Bounds for Approximations from Projected Linear Equations.
Math. Oper. Res., 2010

Pathologies of temporal difference methods in approximate dynamic programming.
Proceedings of the 49th IEEE Conference on Decision and Control, 2010

Neuro-Dynamic Programming.
Proceedings of the Encyclopedia of Optimization, Second Edition, 2009

Auction Algorithms.
Proceedings of the Encyclopedia of Optimization, Second Edition, 2009

Convergence Results for Some Temporal Difference Methods Based on Least Squares.
IEEE Trans. Autom. Control., 2009

Basis function adaptation methods for cost approximation in MDP.
Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2009

A unified framework for temporal difference methods.
Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2009

On Near Optimality of the Set of Finite-State Controllers for Average Cost POMDP.
Math. Oper. Res., 2008

New Error Bounds for Approximations from Projected Linear Equations.
Proceedings of the Recent Advances in Reinforcement Learning, 8th European Workshop, 2008

Erratum to "Comments on 'Coordination of Groups of Mobile Autonomous Agents Using Nearest Neighbor Rules'".
IEEE Trans. Autom. Control., 2007

Comments on "Coordination of Groups of Mobile Autonomous Agents Using Nearest Neighbor Rules".
IEEE Trans. Autom. Control., 2007

Separable Dynamic Programming and Approximate Decomposition Methods.
IEEE Trans. Autom. Control., 2007

Set Intersection Theorems and Existence of Optimal Solutions.
Math. Program., 2007

Enhanced Fritz John Conditions for Convex Programming.
SIAM J. Optim., 2006

Neuro-Dynamic Programming: An Overview and Recent Results.
Proceedings of the Operations Research, 2006

Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC.
Eur. J. Control, 2005

Dynamic programming and optimal control, 3rd Edition.
Athena Scientific, ISBN: 1886529264, 2005

The relation between pseudonormality and quasiregularity in constrained optimization.
Optim. Methods Softw., 2004

Discretized Approximations for POMDP with Average Cost.
Proceedings of the UAI '04, 2004

Routing and wavelength assignment in optical networks.
IEEE/ACM Trans. Netw., 2003

Least Squares Policy Evaluation Algorithms with Linear Function Approximation.
Discret. Event Dyn. Syst., 2003

Stochastic Approximation for Nonexpansive Maps: Application to Q-Learning Algorithms.
SIAM J. Control. Optim., 2002

Distributed power control algorithms for wireless networks.
IEEE Trans. Veh. Technol., 2001

Incremental Subgradient Methods for Nondifferentiable Optimization.
SIAM J. Optim., 2001

Learning Algorithms for Markov Decision Processes with Average Cost.
SIAM J. Control. Optim., 2001

Reservation-Based Session Routing for Broadband Communication Networks with Strict QoS Requirements.
Proceedings of the 15th International Conference on Information Networking, 2001

Missile defense and interceptor allocation by neuro-dynamic programming.
IEEE Trans. Syst. Man Cybern. Part A, 2000

Gradient Convergence in Gradient methods with Errors.
SIAM J. Optim., 2000

An ε-relaxation method for separable convex cost generalized network flow problems.
Math. Program., 2000

Rollout Algorithms for Stochastic Scheduling Problems.
J. Heuristics, 1999

A Note on Error Bounds for Convex and Nonconvex Programs.
Comput. Optim. Appl., 1999

An ε-Relaxation Method for Separable Convex Cost Network Flow Problems.
SIAM J. Optim., 1997

A New Class of Incremental Gradient Methods for Least Squares Problems.
SIAM J. Optim., 1997

Rollout Algorithms for Combinatorial Optimization.
J. Heuristics, 1997

A Conflict Sense Routing Protocol and Its Performance for Hypercubes.
IEEE Trans. Computers, 1996

Incremental Least Squares Methods and the Extended Kalman Filter.
SIAM J. Optim., 1996

Finite Termination of Asynchronous Iterative Algorithms.
Parallel Comput., 1996

Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems.
Proceedings of the Advances in Neural Information Processing Systems 9, 1996

A epsilon-Relaxation Method for Generalized Separable Convex Cost Network Flow Problems.
Proceedings of the Integer Programming and Combinatorial Optimization, 1996

Neuro-dynamic programming.
Optimization and neural computation series 3, Athena Scientific, ISBN: 1886529108, 1996

Dynamic Broadcasting in Parallel Computing.
IEEE Trans. Parallel Distributed Syst., 1995

Transposition of Banded Matrices in Hypercubes: A Nearly Isotropic Task.
Parallel Comput., 1995

Generic rank-one corrections for value iteration in Markovian decision problems.
Oper. Res. Lett., 1995

A Counterexample to Temporal Differences Learning.
Neural Comput., 1995

Polynomial auction algorithms for shortest paths.
Comput. Optim. Appl., 1995

Performance of hypercube routing schemes with or without buffering.
IEEE/ACM Trans. Netw., 1994

Partial Proximal Minimization Algorithms for Convex Pprogramming.
SIAM J. Optim., 1994

Parallel Shortest Path Auction Algorithms.
Parallel Comput., 1994

Partial Multinode Broadcast and Partial Exchange Algorithms for d-Dimensional Meshes.
J. Parallel Distributed Comput., 1994

Multinode Broadcast in Hypercubes and Rings with Randomly Distributed Length of Packets.
IEEE Trans. Parallel Distributed Syst., 1993

Reverse Auction and the Solution of Inequality Constrained Assignment Problems.
SIAM J. Optim., 1993

A simple and fast label correcting algorithm for shortest paths.
Networks, 1993

On the convergence of the exponential multiplier method for convex programming.
Math. Program., 1993

Parallel Asynchronous Hungarian Methods for the Assignment Problem.
INFORMS J. Comput., 1993

Parallel primal-dual methods for the minimum cost flow problem.
Comput. Optim. Appl., 1993

A generic auction algorithm for the minimum cost network flow problem.
Comput. Optim. Appl., 1993

Communication algorithms for isotropic tasks in hypercubes and wraparound meshes.
Parallel Comput., 1992

On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators.
Math. Program., 1992

A forward/reverse auction algorithm for asymmetric assignment problems.
Comput. Optim. Appl., 1992

Auction algorithms for network flow problems: A tutorial introduction.
Comput. Optim. Appl., 1992

Partial Multinode Broadcast Algorithms for D-Dimensional Meshes.
Proceedings of the 1992 International Conference on Parallel Processing, 1992

Data Networks, Second Edition.
Prentice Hall, ISBN: 978-0-13-201674-2, 1992

An Auction Algorithm for Shortest Paths.
SIAM J. Optim., 1991

Parallel synchronous and asynchronous implementations of the auction algorithm.
Parallel Comput., 1991

Relaxation Methods for Problems with Strictly Convex Costs and Linear Constraints.
Math. Oper. Res., 1991

An Analysis of Stochastic Shortest Path Problems.
Math. Oper. Res., 1991

Optimal Communication Algorithms for Hypercubes.
J. Parallel Distributed Comput., 1991

Some aspects of parallel and distributed iterative algorithms - A survey<sup>, </sup>.
Autom., 1991

Linear network optimization - algorithms and codes.
MIT Press, ISBN: 978-0-262-02334-4, 1991

Relaxation Methods for Monotropic Programs.
Math. Program., 1990

Convergence rate and termination of asynchronous iterative algorithms.
Proceedings of the 3rd international conference on Supercomputing, 1989

Parallel and distributed computation.
Prentice Hall, ISBN: 978-0-13-648759-3, 1989

Dual coordinate step methods for linear network flow problems.
Math. Program., 1988

Relaxation Methods for Minimum Cost Ordinary and Generalized Network Flow Problems.
Oper. Res., 1988

Asymptotic optimality of shortest path routing algorithms.
IEEE Trans. Inf. Theory, 1987

Relaxation methods for problems with strictly convex separable costs and linear constraints.
Math. Program., 1987

Relaxation Methods for Linear Programs.
Math. Oper. Res., 1987

A unified framework for primal-dual methods in minimum cost network flow problems.
Math. Program., 1985

Second Derivative Algorithms for Minimum Delay Distributed Routing in Networks.
IEEE Trans. Commun., 1984

Distributed asynchronous computation of fixed points.
Math. Program., 1983

Distributed Algorithms for Generating Loop-Free Routes in Networks with Frequently Changing Topology.
IEEE Trans. Commun., 1981

A new algorithm for the assignment problem.
Math. Program., 1981

Universally Measurable Policies in Dynamic Programming.
Math. Oper. Res., 1979

Multiplier methods: A survey.
Autom., 1976

Necessary and sufficient conditions for a penalty method to be exact.
Math. Program., 1975