Siliang Zeng

Orcid: 0009-0002-5863-8659

According to our database, Siliang Zeng authored at least 23 papers between 2006 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Aligning Frozen LLMs by Reinforcement Learning: An Iterative Reweight-then-Optimize Approach.
CoRR, June, 2025

Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Credit Assignment.
CoRR, May, 2025

From Demonstrations to Rewards: Alignment Without Explicit Human Preferences.
CoRR, March, 2025

Bridging the Training-Inference Gap in LLMs by Leveraging Self-Generated Tokens.
Trans. Mach. Learn. Res., 2025

Structural Estimation of Markov Decision Processes in High-Dimensional State Space with Finite-Time Guarantees.
Oper. Res., 2025

Joint Reward and Policy Learning with Demonstrations and Human Feedback Improves Alignment.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2025

2024
Joint Demonstration and Preference Learning Improves Policy Alignment with Human Feedback.
CoRR, 2024

Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

2023
Understanding Expertise through Demonstrations: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning.
CoRR, 2023

When Demonstrations meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

ProfLLM: A framework for adapting offline large language models to few-shot expert knowledge.
Proceedings of the 4th International Conference on Artificial Intelligence and Computer Engineering, 2023

A Bayesian Approach to Robust Inverse Reinforcement Learning.
Proceedings of the Conference on Robot Learning, 2023

2022
On the Divergence of Decentralized Nonconvex Optimization.
SIAM J. Optim., December, 2022

Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

A Stochastic Linearized Augmented Lagrangian Method for Decentralized Bilevel Optimization.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Learning to Coordinate in Multi-Agent Systems: A Coordinated Actor-Critic Algorithm and Finite-Time Guarantees.
Proceedings of the Learning for Dynamics and Control Conference, 2022

2021
A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm for Bilevel Optimization.
CoRR, 2021

A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2020
On the Divergence of Decentralized Non-Convex Optimization.
CoRR, 2020

Multi-Agent Reinforcement Learning for Adaptive Routing: A Hybrid Method using Eligibility Traces.
Proceedings of the 16th IEEE International Conference on Control & Automation, 2020

Network-Level System Performance Prediction Using Deep Neural Networks with Cross-Layer Information.
Proceedings of the 2020 IEEE International Conference on Communications, 2020

2006
A new In-network data aggregation technology of wireless sensor networks.
Proceedings of the 2006 International Conference on Semantics, 2006
