Soichiro Nishimori

According to our database, Soichiro Nishimori authored at least 13 papers between 2022 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
Finite-Time Regret Analysis of Retry-Aware Bandits.
CoRR, May, 2026

Mahjax: A GPU-Accelerated Mahjong Simulator for Reinforcement Learning in JAX.
CoRR, May, 2026

Mitigating Reward Hacking in RLHF via Advantage Sign Robustness.
CoRR, April, 2026

On Symmetric Losses for Policy Optimization with Noisy Preferences.
Trans. Mach. Learn. Res., 2026

2025
Recursive Reward Aggregation.
CoRR, July, 2025

On Symmetric Losses for Robust Policy Optimization with Noisy Preferences.
CoRR, May, 2025

2024
Leveraging Domain-Unlabeled Data in Offline Reinforcement Learning across Two Domains.
CoRR, 2024

A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees.
CoRR, 2024

A Batch Sequential Halving Algorithm without Performance Degradation.
RLJ, 2024

2023
End-to-End Policy Gradient Method for POMDPs and Explainable Agents.
CoRR, 2023

Pgx: Hardware-accelerated parallel game simulation for reinforcement learning.
CoRR, 2023

Pgx: Hardware-Accelerated Parallel Game Simulators for Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022
Mjx: A framework for Mahjong AI research.
Proceedings of the IEEE Conference on Games, CoG 2022, Beijing, 2022
