Chenjia Bai

Orcid: 0000-0002-8379-9385

According to our database, Chenjia Bai authored at least 76 papers between 2019 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Temporal consistent multi-view perception for robust embodied manipulation.

Pattern Recognit., 2026

2025

Towards Reliable LLM-based Robot Planning via Combined Uncertainty Estimation.

CoRR, October, 2025

KungfuBot2: Learning Versatile Motion Skills for Humanoid Whole-Body Control.

CoRR, September, 2025

Align-Then-stEer: Adapting the Vision-Language Action Models through Unified Latent Guidance.

CoRR, September, 2025

On the Value of Myopic Behavior in Policy Reuse.

IEEE Trans. Pattern Anal. Mach. Intell., August, 2025

HumanoidGen: Data Generation for Bimanual Dexterous Manipulation via LLM Reasoning.

CoRR, July, 2025

Skill-Nav: Enhanced Navigation with Versatile Quadrupedal Locomotion via Waypoint Interface.

CoRR, June, 2025

Unsupervised Skill Discovery through Skill Regions Differentiation.

CoRR, June, 2025

KungfuBot: Physics-Based Humanoid Whole-Body Control for Learning Highly-Dynamic Skills.

CoRR, June, 2025

MoRE: Mixture of Residual Experts for Humanoid Lifelike Gaits Learning on Complex Terrains.

CoRR, June, 2025

Learn as Individuals, Evolve as a Team: Multi-agent LLMs Adaptation in Embodied Environments.

CoRR, June, 2025

Towards a Generalizable Bimanual Foundation Policy via Flow-based Video Prediction.

CoRR, May, 2025

Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective.

CoRR, May, 2025

Online Iterative Self-Alignment for Radiology Report Generation.

CoRR, May, 2025

Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning.

CoRR, April, 2025

Information-Theoretic Reward Decomposition for Generalizable RLHF.

CoRR, April, 2025

Humanoid Whole-Body Locomotion on Narrow Terrain via Dynamic Balance and Reinforcement Learning.

CoRR, February, 2025

VLP: Vision-Language Preference Learning for Embodied Manipulation.

CoRR, February, 2025

Decentralized Transformers with Centralized Aggregation are Sample-Efficient Multi-Agent World Models.

Trans. Mach. Learn. Res., 2025

Skill matters: Dynamic skill learning for multi-agent cooperative reinforcement learning.

Neural Networks, 2025

Combining long and short spatiotemporal reasoning for deep reinforcement learning.

Huiling Liu, Peng Liu, Chenjia Bai

Neurocomputing, 2025

Provably efficient information-directed sampling algorithms for multi-agent reinforcement learning.

Artif. Intell., 2025

Task-Agnostic Pre-training and Task-Guided Fine-tuning for Versatile Diffusion Planner.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Discriminator-Guided Embodied Planning for LLM Agent.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Exponential Topology-enabled Scalable Communication in Multi-agent Reinforcement Learning.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Online Preference Alignment for Language Models via Count-based Exploration.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Online Iterative Self-Alignment for Radiology Report Generation.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Radiology Report Generation via Multi-objective Preference Optimization.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

Forward KL Regularized Preference Optimization for Aligning Diffusion Policies.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain.

IEEE Trans. Neural Networks Learn. Syst., July, 2024

Monotonic Quantile Network for Worst-Case Offline Reinforcement Learning.

IEEE Trans. Neural Networks Learn. Syst., July, 2024

False Correlation Reduction for Offline Reinforcement Learning.

IEEE Trans. Pattern Anal. Mach. Intell., February, 2024

Pessimistic value iteration for multi-task data sharing in Offline Reinforcement Learning.

Artif. Intell., January, 2024

Towards Robust Offline-to-Online Reinforcement Learning via Uncertainty and Smoothness.

J. Artif. Intell. Res., 2024

Diverse randomized value functions: A provably pessimistic approach for offline reinforcement learning.

Inf. Sci., 2024

Radiology Report Generation via Multi-objective Preference Optimization.

CoRR, 2024

Preference Aligned Diffusion Planner for Quadrupedal Locomotion Control.

CoRR, 2024

SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation.

CoRR, 2024

Diverse Randomized Value Functions: A Provably Pessimistic Approach for Offline Reinforcement Learning.

CoRR, 2024

Large-Scale Actionless Video Pre-Training via Discrete Diffusion for Efficient Policy Learning.

CoRR, 2024

Ensemble successor representations for task generalization in offline-to-online reinforcement learning.

Sci. China Inf. Sci., 2024

Regularized Conditional Diffusion Model for Multi-Task Preference Alignment.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

ODRL: A Benchmark for Off-Dynamics Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Learning an Actionable Discrete Diffusion Policy via Large-Scale Actionless Video Pre-Training.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Robust Quadrupedal Locomotion via Risk-Averse Policy Learning.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

How Does Goal Relabeling Improve Sample Efficiency?

Proceedings of the Forty-first International Conference on Machine Learning, 2024

SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Contrastive Representation for Data Filtering in Cross-Domain Offline Reinforcement Learning.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Cross-Domain Policy Adaptation by Capturing Representation Mismatch.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Constrained Ensemble Exploration for Unsupervised Skill Discovery.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

SelfBC: Self Behavior Cloning for Offline Reinforcement Learning.

Proceedings of the ECAI 2024 - 27th European Conference on Artificial Intelligence, 19-24 October 2024, Santiago de Compostela, Spain, 2024

Bridging the Sim-to-Real Gap from the Information Bottleneck Perspective.

Proceedings of the Conference on Robot Learning, 6-9 November 2024, Munich, Germany, 2024

OVD-Explorer: Optimism Should Not Be the Sole Pursuit of Exploration in Noisy Environments.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Self-Supervised Imitation for Offline Reinforcement Learning With Hindsight Relabeling.

IEEE Trans. Syst. Man Cybern. Syst., December, 2023

Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning.

IEEE Trans. Neural Networks Learn. Syst., August, 2023

Addressing Hindsight Bias in Multigoal Reinforcement Learning.

IEEE Trans. Cybern., 2023

Towards Robust Offline-to-Online Reinforcement Learning via Uncertainty and Smoothness.

CoRR, 2023

Robust Quadrupedal Locomotion via Risk-Averse Policy Learning.

CoRR, 2023

Privileged Knowledge Distillation for Sim-to-Real Policy Generalization.

CoRR, 2023

On the Value of Myopic Behavior in Policy Reuse.

CoRR, 2023

Cross-Domain Policy Adaptation via Value-Guided Data Filtering.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Behavior Contrastive Learning for Unsupervised Skill Discovery.

Proceedings of the International Conference on Machine Learning, 2023

2022

RORL: Robust Offline Reinforcement Learning via Conservative Smoothing.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning.

Proceedings of the International Conference on Machine Learning, 2022

Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning.

Proceedings of the Tenth International Conference on Learning Representations, 2022

2021

SCORE: Spurious COrrelation REduction for Offline Reinforcement Learning.

CoRR, 2021

Exploration in Deep Reinforcement Learning: A Comprehensive Survey.

CoRR, 2021

Dynamic Bottleneck for Robust Self-Supervised Exploration.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Principled Exploration via Optimistic Bootstrapping and Backward Induction.

Proceedings of the 38th International Conference on Machine Learning, 2021

2020

Obtaining accurate estimated action values in categorical distributional reinforcement learning.

Knowl. Based Syst., 2020

Generating attentive goals for prioritized hindsight reinforcement learning.

Knowl. Based Syst., 2020

深度强化学习中稀疏奖励问题研究综述 (Survey on Sparse Reward in Deep Reinforcement Learning).

计算机科学 (Computer Science), 2020

Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning.

CoRR, 2020

2019

Guided goal generation for hindsight multi-goal reinforcement learning.

Neurocomputing, 2019
