Bei Peng

Orcid: 0000-0003-0152-3180

Affiliations:

University of Liverpool, UK
University of Oxford, UK (former)
Washington State University, Pullman, WA, USA (PhD 2018)

According to our database, Bei Peng authored at least 39 papers between 2012 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Rethinking Cross-Generator Image Forgery Detection through DINOv3.

CoRR, November, 2025

Heuristic Transformer: Belief Augmented In-Context Reinforcement Learning.

Oliver Dippel

Alexei Lisitsa

Bei Peng

CoRR, November, 2025

So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection.

CoRR, May, 2025

SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

Contextual Transformers for Goal-Oriented Reinforcement Learning.

Oliver Dippel

Alexei Lisitsa

Bei Peng

Proceedings of the Artificial Intelligence XLI, 2024

Accelerating Laboratory Automation Through Robot Skill Learning For Sample Scraping.

Proceedings of the 20th IEEE International Conference on Automation Science and Engineering, 2024

2023

Deep Reinforcement Learning for Continuous Control of Material Thickness.

Oliver Dippel

Alexei Lisitsa

Bei Peng

Proceedings of the Artificial Intelligence XL, 2023

2022

Curriculum Learning for Relative Overgeneralization.

Lin Shi

Bei Peng

CoRR, 2022

Dependable learning-enabled multiagent systems.

Xiaowei Huang

Bei Peng

Xingyu Zhao

AI Commun., 2022

2021

Special issue on adaptive and learning agents 2018.

Knowl. Eng. Rev., 2021

Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients.

CoRR, 2021

Softmax with Regularization: Better Value Estimation in Multi-Agent Reinforcement Learning.

CoRR, 2021

FACMAC: Factored Multi-Agent Centralised Policy Gradients.

Bei Peng

Tabish Rashid

Christian Schröder de Witt

Pierre-Alexandre Kamienny

Philip H. S. Torr

Wendelin Boehmer

Shimon Whiteson

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Regularized Softmax Deep Multi-Agent Q-Learning.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning.

Shariq Iqbal

Christian A. Schröder de Witt

Proceedings of the 38th International Conference on Machine Learning, 2021

UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning.

Proceedings of the 38th International Conference on Machine Learning, 2021

RODE: Learning Roles to Decompose Multi-Agent Tasks.

Proceedings of the 9th International Conference on Learning Representations, 2021

2020

Special issue on adaptive and learning agents 2019.

Knowl. Eng. Rev., 2020

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey.

J. Mach. Learn. Res., 2020

Weighted QMIX: Expanding Monotonic Value Function Factorisation.

CoRR, 2020

AI-QMIX: Attention and Imagination for Dynamic Multi-Agent Reinforcement Learning.

Shariq Iqbal

Christian A. Schröder de Witt

CoRR, 2020

Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control.

Christian Schröder de Witt

Bei Peng

Pierre-Alexandre Kamienny

Philip H. S. Torr

Wendelin Böhmer

Shimon Whiteson

CoRR, 2020

Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Optimistic Exploration even with a Pessimistic Initialisation.

Proceedings of the 8th International Conference on Learning Representations, 2020

2019

VIABLE: Fast Adaptation via Backpropagating Learned Loss.

CoRR, 2019

Interactive Learning of Environment Dynamics for Sequential Tasks.

CoRR, 2019

2017

Interactive Learning from Policy-Dependent Human Feedback.

CoRR, 2017

Interactive Learning from Policy-Dependent Human Feedback.

Proceedings of the 34th International Conference on Machine Learning, 2017

Curriculum Design for Machine Learners in Sequential Decision Tasks.

Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, 2017

How Do Humans Teach: On Curriculum Design for Machine Learners.

Bei Peng

Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, 2017

2016

Learning behaviors via human-delivered discrete feedback: modeling implicit feedback strategies to speed up learning.

Auton. Agents Multi Agent Syst., 2016

A Need for Speed: Adapting Agent Action Speed to Improve Task Learning from Non-Expert Humans.

Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, 2016

Towards Behavior-Aware Model Learning from Human-Generated Trajectories.

Proceedings of the 2016 AAAI Fall Symposia, Arlington, Virginia, USA, November 17-19, 2016, 2016

2015

Towards Integrating Real-Time Crowd Advice with Reinforcement Learning.

Gabriel Victor de la Cruz

Bei Peng

Walter S. Lasecki

Matthew E. Taylor

Proceedings of the 20th International Conference on Intelligent User Interfaces Companion, 2015

On the Ability to Provide Demonstrations on a UAS: Observing 90 Untrained Participants Abusing a Flying Robot.

Proceedings of the 2015 AAAI Fall Symposia, Arlington, Virginia, USA, November 12-14, 2015, 2015

Generating Real-Time Crowd Advice to Improve Reinforcement Learning Agents.

Gabriel Victor de la Cruz

Bei Peng

Walter Stephen Lasecki

Matthew Edmund Taylor

Proceedings of the Learning for General Competency in Video Games, 2015

2014

Learning something from nothing: Leveraging implicit human feedback strategies.

Proceedings of the 23rd IEEE International Symposium on Robot and Human Interactive Communication, 2014

A Strategy-Aware Technique for Learning Behaviors from Discrete Human Feedback.

Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2012

A GPU-Based Accelerator for Chinese Word Segmentation.

Proceedings of the Web Technologies and Applications - 14th Asia-Pacific Web Conference, 2012
