Tadashi Kozuno

Orcid: 0000-0002-8820-1362

According to our database, Tadashi Kozuno authored at least 30 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Symmetry-aware Reinforcement Learning for Robotic Assembly under Partial Observability with a Soft Wrist.
CoRR, 2024

A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees.
CoRR, 2024

2023
Benchmarking Actor-Critic Deep Reinforcement Learning Algorithms for Robotics Control With Action Constraints.
IEEE Robotics Autom. Lett., 2023

Multi-Agent Behavior Retrieval: Retrieval-Augmented Policy Training for Cooperative Manipulation by Mobile Robots.
CoRR, 2023

Local and adaptive mirror descents in extensive-form games.
CoRR, 2023

When to Replan? An Adaptive Replanning Strategy for Autonomous Navigation using Deep Reinforcement Learning.
CoRR, 2023

Avoiding Model Estimation in Robust Markov Decision Processes with a Generative Model.
CoRR, 2023

DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm.
Proceedings of the International Conference on Machine Learning, 2023

Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice.
Proceedings of the International Conference on Machine Learning, 2023

Adapting to game trees in zero-sum imperfect information games.
Proceedings of the International Conference on Machine Learning, 2023

Counterfactual Fairness Filter for Fair-Delay Multi-Robot Navigation.
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

2022
No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL.
Trans. Mach. Learn. Res., 2022

Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences.
J. Mach. Learn. Res., 2022

Confident Approximate Policy Iteration for Efficient Local Planning in q^π-realizable MDPs.
CoRR, 2022

KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal.
CoRR, 2022

Deep Learning-based Nonlinear Quantizer for Fronthaul Compression.
Proceedings of the 2022 27th OptoElectronics and Communications Conference (OECC) and 2022 International Conference on Photonics in Switching and Computing (PSC), 2022

Confident Approximate Policy Iteration for Efficient Local Planning in q^π-realizable MDPs.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Variational oracle guiding for reinforcement learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
Model-Free Learning for Two-Player Zero-Sum Partially Observable Markov Games with Perfect Recall.
CoRR, 2021

Identifying Co-Adaptation of Algorithmic and Implementational Innovations in Deep Reinforcement Learning: A Taxonomy and Case Study of Inference-based Algorithms.
CoRR, 2021

Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning in two-player zero-sum partially observable Markov games with perfect recall.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Co-Adaptation of Algorithmic and Implementational Innovations in Inference-based Deep Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Revisiting Peng's Q(λ) for Modern Reinforcement Learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

2020
Leverage the Average: an Analysis of Regularization in RL.
CoRR, 2020

Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019
Gap-Increasing Policy Evaluation for Efficient and Noise-Tolerant Reinforcement Learning.
CoRR, 2019

Theoretical Analysis of Efficiency and Robustness of Softmax and Gap-Increasing Operators in Reinforcement Learning.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2017
Unifying Value Iteration, Advantage Learning, and Dynamic Policy Programming.
CoRR, 2017
