Shalabh Bhatnagar

IEEE Trans. Autom. Control., 2021

On tight bounds for function approximation error in risk-sensitive reinforcement learning.

Syst. Control. Lett., 2021

N-Timescale Stochastic Approximation: Stability and Convergence.

Rohan Deb

CoRR, 2021

Finite Horizon Q-learning: Stability, Convergence and Simulations.

Vivek VP

CoRR, 2021

Novel First Order Bayesian Optimization with an Application to Reinforcement Learning.

Santosh Penubothula

Appl. Intell., 2021

Attention Actor-Critic Algorithm for Multi-Agent Constrained Co-operative Reinforcement Learning.

P. Parnika

Proceedings of the AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, 2021

2020

Analysis of Stochastic Approximation Schemes With Set-Valued Maps in the Absence of a Stability Guarantee and Their Stabilization.

Vinayaka G. Yaji

IEEE Trans. Autom. Control., 2020

Random Directions Stochastic Approximation With Deterministic Perturbations.

IEEE Trans. Autom. Control., 2020

Stochastic Recursive Inclusions in Two Timescales with Nonadditive Iterate-Dependent Markov Noise.

Vinayaka G. Yaji

Math. Oper. Res., 2020

Successive Over-Relaxation Q-Learning.

IEEE Control. Syst. Lett., 2020

Generalized Speedy Q-Learning.

IEEE Control. Syst. Lett., 2020

Robust Quadrupedal Locomotion on Sloped Terrains: A Linear Policy Approach.

CoRR, 2020

Hindsight Experience Replay with Kronecker Product Approximate Curvature.

Dhuruva Priyan G. M

Abhik Singla

CoRR, 2020

A reinforcement learning approach to hybrid control design.

Meet Gandhi

Atreyee Kundu

Annanya Pratap Singh Chauhan

CoRR, 2020

A Stochastic Game Framework for Efficient Energy Management in Microgrid Networks.

Shravan Nayak

Chanakya Ajit Ekbote

CoRR, 2020

Reinforcement learning algorithm for non-stationary environments.

Appl. Intell., 2020

Learning Stable Manoeuvres in Quadruped Robots from Expert Demonstrations.

Sashank Tirumala

Sagar Venkatesh Gubbi

Proceedings of the 29th IEEE International Conference on Robot and Human Interactive Communication, 2020

Learning-Based Resource Allocation in Industrial IoT Systems.

Shilpa Rao

Annanya Pratap Singh Chauhan

Proceedings of the 31st IEEE Annual International Symposium on Personal, 2020

Stochastic Game Frameworks for Efficient Energy Management in Microgrid Networks.

Shravan Nayak

Chanakya Ajit Ekbote

Proceedings of the IEEE PES Innovative Smart Grid Technologies Europe, 2020

Deep Reinforcement Learning with Successive Over-Relaxation and its Application in Autoscaling Cloud Resources.

Proceedings of the 2020 International Joint Conference on Neural Networks, 2020

A Convergent Off-Policy Temporal Difference Algorithm.

Proceedings of the ECAI 2020 - 24th European Conference on Artificial Intelligence, 29 August-8 September 2020, Santiago de Compostela, Spain, 2020

Robust Quadrupedal Locomotion on Sloped Terrains: A Linear Policy Approach.

Proceedings of the 4th Conference on Robot Learning, 2020

Hierarchical Average Reward Policy Gradient Algorithms (Student Abstract).

Akshay Dharmavaram

Matthew Riemer

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Stability of Stochastic Approximations With "Controlled Markov" Noise and Temporal Difference Learning.

IEEE Trans. Autom. Control., 2019

An Online Sample-Based Method for Mode Estimation Using ODE Analysis of Stochastic Approximation Algorithms.

IEEE Control. Syst. Lett., 2019

Gait Library Synthesis for Quadruped Robots via Augmented Random Search.

CoRR, 2019

Hierarchical Average Reward Policy Gradient Algorithms.

Akshay Dharmavaram

Matthew Riemer

CoRR, 2019

Solution of Two-Player Zero-Sum Game by Successive Relaxation.

CoRR, 2019

Reinforcement Learning in Non-Stationary Environments.

CoRR, 2019

Second Order Value Iteration in Reinforcement Learning.

CoRR, 2019

Design, Development and Experimental Realization of a Quadrupedal Research Platform: Stoch.

CoRR, 2019

Efficient Adaptive Resource Provisioning for Cloud Applications using Reinforcement Learning.

Aiswarya Sreekantan

Proceedings of the IEEE 4th International Workshops on Foundations and Applications of Self* Systems, 2019

Trajectory based Deep Policy Search for Quadrupedal Walking.

Proceedings of the 28th IEEE International Conference on Robot and Human Interactive Communication, 2019

Learning Active Spine Behaviors for Dynamic and Efficient Locomotion in Quadruped Robots.

Proceedings of the 28th IEEE International Conference on Robot and Human Interactive Communication, 2019

Realizing Learned Quadruped Locomotion Behaviors through Kinematic Motion Primitives.

Proceedings of the International Conference on Robotics and Automation, 2019

Predictive and Prescriptive Analytics for Performance Optimization: Framework and a Case Study on a Large-Scale Enterprise System.

Ravikumar Karumanchi

Proceedings of the 18th IEEE International Conference On Machine Learning And Applications, 2019

Efficient Budget Allocation and Task Assignment in Crowdsourcing.

Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, 2019

An Adaptive and Incremental Approach to Quantile Estimation.

Proceedings of the 58th IEEE Conference on Decision and Control, 2019

Actor-Critic Algorithms for Constrained Multi-agent Reinforcement Learning.

Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

Stochastic Approximation Trackers for Model-Based Search.

Proceedings of the 57th Annual Allerton Conference on Communication, 2019

2018

Novel Sensor Scheduling Scheme for Intruder Tracking in Energy Efficient Sensor Networks.

IEEE Wirel. Commun. Lett., 2018

A stochastic approximation approach to active queue management.

Sanjeev Patel

Telecommun. Syst., 2018

Analysis of Gradient Descent Methods With Nondiminishing Bounded Errors.

IEEE Trans. Autom. Control., 2018

A Linearly Relaxed Approximate Linear Program for Markov Decision Processes.

Csaba Szepesvári

IEEE Trans. Autom. Control., 2018

Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning.

Math. Oper. Res., 2018

An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method.

Mach. Learn., 2018

An incremental off-policy search in a model-free Markov decision process using a single sample path.

Mach. Learn., 2018

Gradient-Based Adaptive Stochastic Search for Simulation Optimization Over Continuous Space.

Enlu Zhou

INFORMS J. Comput., 2018

A Cross Entropy based Optimization Algorithm with Global Convergence Guarantees.

CoRR, 2018

A unified decision making framework for supply and demand management in microgrid networks.

Proceedings of the 2018 IEEE International Conference on Communications, 2018

Generalized Deterministic Perturbations For Stochastic Gradient Search.

Proceedings of the 57th IEEE Conference on Decision and Control, 2018

2017

Adaptive mean queue size and its rate of change: queue management with random dropping.

Sanjeev Patel

Telecommun. Syst., 2017

Adaptive System Optimization Using Random Directions Stochastic Approximation.

IEEE Trans. Autom. Control., 2017

A Generalization of the Borkar-Meyn Theorem for Stochastic Recursive Inclusions.

Nagendra Dwarakanath Gulur

Math. Oper. Res., 2017

RLWS: A Reinforcement Learning based GPU Warp Scheduler.

Jayvant Anantpur

Shivaram Kalyanakrishnan

R. Govindarajan

CoRR, 2017

A unified decision making framework for supply and demand management in microgrid networks.

Krishnasuri Narayanam

CoRR, 2017

Conditions for Stability and Convergence of Set-Valued Stochastic Approximations: Applications to Approximate Value and Fixed point Iterations with Noise.

CoRR, 2017

Multi-Agent Q-Learning for Minimizing Demand-Supply Power Deficit in Microgrids.

CoRR, 2017

Deterministic Perturbations For Simultaneous Perturbation Methods Using Circulant Matrices.

Chandramouli K

CoRR, 2017

Quasi-Newton smoothed functional algorithms for unconstrained and constrained simulation optimization.

Comput. Optim. Appl., 2017

A stability criterion for two timescale stochastic approximation schemes.

Autom., 2017

An Incremental Fast Policy Search Using a Single Sample Path.

Proceedings of the Pattern Recognition and Machine Intelligence, 2017

Bounds for off-policy prediction in reinforcement learning.

Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

A model based search method for prediction in model-free Markov decision process.

Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

Scalable Performance Tuning of Hadoop MapReduce: A Noisy Gradient Approach.

Sandeep Kumar

Priyank Parihar

K. Gopinath

Proceedings of the 2017 IEEE 10th International Conference on Cloud Computing (CLOUD), 2017

2016

Actor-Critic Algorithms with Online Feature Adaptation.

ACM Trans. Model. Comput. Simul., 2016

A constrained optimization perspective on actor-critic algorithms and application to network routing.

Syst. Control. Lett., 2016

Multiscale Q-learning with linear function approximation.

Discret. Event Dyn. Syst., 2016

Stochastic Recursive Inclusions in two timescales with non-additive iterate dependent Markov noise.

CoRR, 2016

Stochastic Recursive Inclusions with Non-Additive Iterate-Dependent Markov Noise.

CoRR, 2016

Gradient-based learning algorithms with constant-error estimators: stability and convergence.

CoRR, 2016

Performance Tuning of Hadoop MapReduce: A Noisy Gradient Approach.

Sandeep Kumar

Priyank Parihar

K. Gopinath

CoRR, 2016

On a convergent off-policy temporal difference learning algorithm in on-line learning environment.

Raj Kumar Maity

CoRR, 2016

A note on the function approximation error bound for risk-sensitive reinforcement learning.

CoRR, 2016

A Cross Entropy based Stochastic Approximation Algorithm for Reinforcement Learning with Linear Function Approximation.

CoRR, 2016

A randomized algorithm for continuous optimization.

Proceedings of the Winter Simulation Conference, 2016

Scalable focussed entity resolution.

Ranganath B. N.

Proceedings of the 2016 International Joint Conference on Neural Networks, 2016

Shaping Proto-Value Functions Using Rewards.

Raj Kumar Maity

Proceedings of the ECAI 2016 - 22nd European Conference on Artificial Intelligence, 29 August-2 September 2016, The Hague, The Netherlands, 2016

Revisiting the Cross Entropy Method with Applications in Stochastic Global Optimization and Reinforcement Learning.

Proceedings of the ECAI 2016 - 22nd European Conference on Artificial Intelligence, 29 August-2 September 2016, The Hague, The Netherlands, 2016

Improved Hessian estimation for adaptive random directions stochastic approximation.

Proceedings of the 55th IEEE Conference on Decision and Control, 2016

2015

Energy Sharing for Multiple Sensor Nodes With Finite Buffers.

IEEE Trans. Commun., 2015

Simultaneous perturbation methods for adaptive labor staffing in service systems.

Simul., 2015

Necessary and sufficient conditions for optimality in constrained general sum stochastic games.

Syst. Control. Lett., 2015

Simultaneous Perturbation Newton Algorithms for Simulation Optimization.

J. Optim. Theory Appl., 2015

A bi-convex optimization problem to compute Nash equilibrium in n-player games and an algorithm.

CoRR, 2015

Stochastic recursive inclusions with two timescales.

CoRR, 2015

A Study of Gradient Descent Schemes for General-Sum Stochastic Games.

Chandrashekar Lakshmi Narayanan

CoRR, 2015

Shaping Proto-Value Functions via Rewards.

Raj Kumar Maity

CoRR, 2015

Two Timescale Stochastic Approximation with Controlled Markov noise.

CoRR, 2015

Adaptive system optimization using (simultaneous) random directions stochastic approximation.

CoRR, 2015

A Stochastic Approximation Algorithm for Quantile Estimation.

Proceedings of the Neural Information Processing - 22nd International Conference, 2015

Decentralized learning for traffic signal control.

Hemanth Kumar A. N

Proceedings of the 7th International Conference on Communication Systems and Networks, 2015

Two-Timescale Algorithms for Learning Nash Equilibria in General-Sum Stochastic Games.

Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015

A Generalized Reduced Linear Program for Markov Decision Processes.

Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014

Two timescale convergent Q-learning for sleep-scheduling in wireless sensor networks.

Abhranil Chatterjee

Wirel. Networks, 2014

Smoothed Functional Algorithms for Stochastic Optimization Using q-Gaussian Distributions.

ACM Trans. Model. Comput. Simul., 2014

A simulation-based algorithm for optimal pricing policy under demand uncertainty.

Saswata Chakravarty

Int. Trans. Oper. Res., 2014

Algorithms for Nash Equilibria in General-Sum Stochastic Games.

CoRR, 2014

Approximate Dynamic Programming based on Projection onto the (min, +) subsemimodule.

CoRR, 2014

Newton-based stochastic optimization using q-Gaussian smoothed functional algorithms.

Autom., 2014

Simulation optimization via gradient-based stochastic search.

Enlu Zhou

Xi Chen

Proceedings of the 2014 Winter Simulation Conference, 2014

Universal Option Models.

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Multi-agent reinforcement learning for traffic signal control.

Hemanth Kumar A. N

Proceedings of the 17th International IEEE Conference on Intelligent Transportation Systems, 2014

A Markov Decision Process Framework for Predictable Job Completion Times on Crowdsourcing Platforms.

Ayush Dubey

Chithralekha Balamurugan

Proceedings of the Second AAAI Conference on Human Computation and Crowdsourcing, 2014

Adaptive sleep-wake control using reinforcement learning in sensor networks.

Abhranil Chatterjee

Proceedings of the Sixth International Conference on Communication Systems and Networks, 2014

Approximate Dynamic Programming with (min; +) linear function approximation for Markov decision processes.

Proceedings of the 53rd IEEE Conference on Decision and Control, 2014

An actor critic algorithm based on Grassmanian search.

Proceedings of the 53rd IEEE Conference on Decision and Control, 2014

2013

Q-Learning Based Energy Management Policies for a Single Sensor Node with Finite Buffer.

Sunil Kumar Meena

IEEE Wirel. Commun. Lett., 2013

Feature Search in the Grassmanian in Online Reinforcement Learning.

Prashanth Lakshmanrao Ananthapadmanabharao

IEEE J. Sel. Top. Signal Process., 2013

Reinforcement Learning for Sleep-Wake Scheduling in Sensor Networks.

Abhranil Chatterjee

Prashanth Lakshmanrao Ananthapadmanabharao

CoRR, 2013

Mechanisms for hostile agents with capacity constraints.

Horabailu Laxminarayana Prasad

Nirmit Desai

Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2013

2012

Threshold Tuning Using Stochastic Optimization for Graded Signal Control.

IEEE Trans. Veh. Technol., 2012

An Online Actor-Critic Algorithm with Function Approximation for Constrained Markov Decision Processes.

J. Optim. Theory Appl., 2012

Smoothed Functional Algorithms for Stochastic Optimization using q-Gaussian Distributions.

CoRR, 2012

q-Gaussian based Smoothed Functional Algorithm for Stochastic Optimization.

CoRR, 2012

Optimal multi-layered congestion based pricing schemes for enhanced QoS.

Koteswara Rao Vemu

Comput. Networks, 2012

General-sum stochastic games: Verifiability conditions for Nash equilibria.

Autom., 2012

q-Gaussian based Smoothed Functional algorithms for stochastic optimization.

Proceedings of the 2012 IEEE International Symposium on Information Theory, 2012

A novel Q-learning algorithm with function approximation for constrained Markov decision processes.

Proceedings of the 50th Annual Allerton Conference on Communication, 2012

2011

Stochastic approximation algorithms for constrained optimization via simulation.

ACM Trans. Model. Comput. Simul., 2011

Reinforcement Learning With Function Approximation for Traffic Signal Control.

IEEE Trans. Intell. Transp. Syst., 2011

An Optimized SDE Model for Slotted Aloha.

IEEE Trans. Commun., 2011

Stochastic Algorithms for Discrete Parameter Simulation Optimization.

IEEE Trans. Autom. Sci. Eng., 2011

The Borkar-Meyn theorem for asynchronous stochastic approximations.

Syst. Control. Lett., 2011

Reinforcement learning with average cost for adaptive control of traffic lights at intersections.

Proceedings of the 14th International IEEE Conference on Intelligent Transportation Systems, 2011

Stochastic Optimization for Adaptive Labor Staffing in Service Systems.

Gargi Banerjee Dasgupta

Proceedings of the Service-Oriented Computing - 9th International Conference, 2011

Smoothed Functional and Quasi-Newton Algorithms for Routing in Multi-stage Queueing Network with Constraints.

Proceedings of the Distributed Computing and Internet Technology, 2011

2010

An efficient algorithm for scheduling in bluetooth piconets and scatternets.

G. Ramana Reddy

V. Rakesh

Vijay Prakash Chaturvedi

Wirel. Networks, 2010

Optimized Policies for the Retransmission Probabilities in Slotted Aloha.

Anshuk Chakraborty

Simul., 2010

An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes.

Syst. Control. Lett., 2010

Toward Off-Policy Learning Control with Function Approximation.

Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

2009

Pattern Synthesis for Nonparametric Pattern Recognition.

Proceedings of the Encyclopedia of Data Warehousing and Mining, Second Edition (4 Volumes), 2009

Optimal parameter trajectory estimation in parameterized SDEs: An algorithmic procedure.

ACM Trans. Model. Comput. Simul., 2009

A probabilistic constrained nonlinear optimization framework to optimize RED parameters.

Rajesh Kumar Patro

Perform. Evaluation, 2009

A proof of convergence of the B-RED and P-RED algorithms for random early detection.

Rajesh Kumar Patro

IEEE Commun. Lett., 2009

Natural actor-critic algorithms.

Autom., 2009

Multi-Step Dyna Planning for Policy Evaluation and Control.

Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation.

Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Fast gradient-descent methods for temporal-difference learning with linear function approximation.

Proceedings of the 26th Annual International Conference on Machine Learning, 2009

LMS-2: Towards an algorithm that is as cheap as LMS and almost as efficient as RLS.

Hengshuai Yao

Csaba Szepesvári

Proceedings of the 48th IEEE Conference on Decision and Control, 2009

2008

Simulation-Based Optimization Algorithms for Finite-Horizon Markov Decision Processes.

Simul., 2008

An efficient ad recommendation system for TV programs.

Multim. Syst., 2008

New algorithms of the Q-learning type.

K. Mohan Babu

Autom., 2008

Ant Colony Optimization Algorithms for Shortest Path Problems.

Sudha Rani Kolavali

Proceedings of the Network Control and Optimization, Second Euro-NF Workshop, 2008

SPSA based feature relevance estimation for video retrieval.

Proceedings of the International Workshop on Multimedia Signal Processing, 2008

2007

Adaptive Newton-based multivariate smoothed functional algorithms for simulation optimization.

ACM Trans. Model. Comput. Simul., 2007

Gelfand-Yaglom-Perez theorem for generalized relative entropy functionals.

Inf. Sci., 2007

Reinforcement Learning Based Algorithms for Average Cost Markov Decision Processes.

Discret. Event Dyn. Syst., 2007

Incremental Natural Actor-Critic Algorithms.

Proceedings of the Advances in Neural Information Processing Systems 20, 2007

An Optimal Weighted-Average Congestion Based Pricing Scheme for Enhanced QoS.

Koteswara Rao Vemu

Proceedings of the Distributed Computing and Internet Technology, 2007

An Efficient and Optimized Bluetooth Scheduling Algorithm for Piconets.

Vijay Prakash Chaturvedi

V. Rakesh

Proceedings of the Distributed Computing and Internet Technology, 2007

Fuzzy Clustering Based Ad Recommendation for TV Programs.

Proceedings of the Interactive TV: a Shared Experience, 5th European Conference, 2007

Link route pricing for enhanced QoS.

Koteswara Rao Vemu

Proceedings of the 46th IEEE Conference on Decision and Control, 2007

Discrete parameter simulation optimization algorithms with applications to admission control with dependent service times.

Proceedings of the 46th IEEE Conference on Decision and Control, 2007

Network flow-control using asynchronous stochastic approximation.

Proceedings of the 46th IEEE Conference on Decision and Control, 2007

Solving MDPs using Two-timescale Simulated Annealing with Multiplicative Weights.

Proceedings of the American Control Conference, 2007

Parametrized Actor-Critic Algorithms for Finite-Horizon MDPs.

Proceedings of the American Control Conference, 2007

2006

Robust optimization of Random Early Detection.

Rahul Vaidya

Telecommun. Syst., 2006

Partition based pattern synthesis technique with efficient algorithms for nearest neighbor classification.

Pattern Recognit. Lett., 2006

A Simulation-Based Algorithm for Ergodic Control of Markov Chains Conditioned on Rare Events.

Madhukar Akarapu

J. Mach. Learn. Res., 2006

On Measure Theoretic definitions of Generalized Information Measures and Maximum Entropy Prescriptions.

CoRR, 2006

Actor-critic algorithms for hierarchical Markov decision processes.

J. Ranjan Panigrahi

Autom., 2006

SPSA algorithms with measurement reuse.

Proceedings of the Winter Simulation Conference WSC 2006, 2006

A Four-Timescale Algorithm for Constrained Stochastic Optimization of RED.

Rajesh Kumar Patro

Proceedings of the 45th IEEE Conference on Decision and Control, 2006

A Reinforcement Learning Based Algorithm for Finite Horizon Markov Decision Processes.

Proceedings of the 45th IEEE Conference on Decision and Control, 2006

2005

Adaptive multivariate three-timescale stochastic approximation algorithms for simulation based optimization.

ACM Trans. Model. Comput. Simul., 2005

Optimal Threshold Policies for Admission Control in Communication Networks via Discrete Parameter Stochastic Approximation.

I. Bala Bhaskar Reddy

Telecommun. Syst., 2005

A Discrete Parameter Stochastic Approximation Algorithm for Simulation Optimization.

Hemant J. Kowshik

Simul., 2005

Overlap pattern synthesis with an efficient nearest neighbor classifier.

Pattern Recognit., 2005

Uniqueness of Nonextensive entropy under Renyi's Recipe.

CoRR, 2005

Properties of Kullback-Leibler cross-entropy minimization in nonextensive framework.

Narasimha Murty Musti

Proceedings of the 2005 IEEE International Symposium on Information Theory, 2005

Solution of MDPs Using Simulation-Based Value Iteration.

Proceedings of the Artificial Intelligence Applications and Innovations - IFIP TC12 WG12.5, 2005

Information theoretic justification of Boltzmann selection and its generalization to Tsallis case.

Proceedings of the IEEE Congress on Evolutionary Computation, 2005

2004

A simultaneous perturbation stochastic approximation-based actor-critic algorithm for Markov decision processes.

Shishir Kumar

IEEE Trans. Autom. Control., 2004

Fusion of multiple approximate nearest neighbor classifiers for fast and efficient classification.

Inf. Fusion, 2004

Generalized Evolutionary Algorithm based on Tsallis Statistics.

CoRR, 2004

A Pattern Synthesis Technique with an Efficient Nearest Neighbor Classifier for Binary Pattern Recognition.

Proceedings of the 17th International Conference on Pattern Recognition, 2004

Cauchy annealing schedule: an annealing schedule for Boltzmann selection scheme in evolutionary algorithms.

Proceedings of the IEEE Congress on Evolutionary Computation, 2004

Hierarchical decision making in semiconductor fabs using multi-time scale Markov decision processes.

Jnana Ranjan Panigrahi

Proceedings of the 43rd IEEE Conference on Decision and Control, 2004

2003

Two-timescale simultaneous perturbation stochastic approximation using deterministic perturbation sequences.

ACM Trans. Model. Comput. Simul., 2003

Multiscale Chaotic SPSA and Smoothed Functional Algorithms for Simulation Optimization.

Simul., 2003

Quotient evolutionary space: abstraction of evolutionary process w.r.t macroscopic properties.

Proceedings of the IEEE Congress on Evolutionary Computation, 2003

2002

A time aggregation approach to Markov decision processes.

Autom., 2002

2001

Optimal structured feedback policies for ABR flow control using two-timescale SPSA.

IEEE/ACM Trans. Netw., 2001

1995

A Convex Analytic Framework for Ergodic Control of Semi-Markov Processes.
