
王锡淮 Xihuai Wang

Ph.D. @ APEX Lab, Shanghai Jiao Tong University.


Shanghai, China

I received my Ph.D. from the APEX Lab at Shanghai Jiao Tong University in 2026, supervised by Prof. Weinan Zhang and Prof. Ying Wen. During my Ph.D., I was affiliated with both the SJTU-Apex Group and the SJTU-MARL Group. I was honored to be selected for the Wen-Tsun Wu AI Honorary Doctoral Program in 2020. Before that, I received my B.Eng. in Computer Science and Technology from the School of Computer Science and Engineering, Sun Yat-sen University, in 2020.

My research lies at the intersection of Reinforcement Learning and Multi-Agent Learning. My current research focuses on:

  • Large Language Model Reasoning and Agency
    • Reinforcement learning methodologies for enhancing reasoning and agentic capabilities of LLMs
    • Human-AI collaborative decision-making with LLM-based agents
  • Multi-Agent Reinforcement Learning
    • Sample-efficient algorithms for cooperative MARL
    • Zero-shot coordination and generalization in multi-agent systems

News

Jan 13, 2026 Ph.D. Graduation: I have successfully defended my Ph.D. thesis and graduated from Shanghai Jiao Tong University! 🎉
Dec 2, 2025 A blog post sharing my perspective on KL estimators in reinforcement learning.
Nov 23, 2025 A blog post sharing my perspective on training–inference mismatch in reinforcement learning for large language models.
May 16, 2025 Our paper Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration has been accepted to ACL 2025!
Sep 26, 2024 Our work on zero-shot coordination evaluation, ZSC-Eval, has been accepted to the NeurIPS 2024 Datasets and Benchmarks Track!
Aug 8, 2023 Gave a talk on cooperative multi-agent reinforcement learning (Coordinate Agents via Policy Optimization) at RLChina.
Mar 25, 2023 Our work on policy optimization in cooperative multi-agent scenarios, Order Matters: Agent-by-agent Policy Optimization, has been accepted to ICLR 2023!

Selected Papers

Full publications are available on Google Scholar or Publications (* denotes equal contribution).
Works are organized by topic:
  1. Large Language Model Reasoning and Agency
  2. Multi-agent Reinforcement Learning
  1. LLM Reasoning & Agency
    63rd ACL, 2025
    Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration
    Shao Zhang*, Xihuai Wang*, Wenhao Zhang, Chaoran Li, Junru Song, Tingyu Li, Lin Qiu, Xuezhi Cao, Xunliang Cai, Wen Yao, Weinan Zhang, Xinbing Wang, and Ying Wen
  2. MARL Generalization
    38th NeurIPS, 2024
    ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination
    Xihuai Wang, Shao Zhang, Wenhao Zhang, Wentao Dong, Jingxiao Chen, Ying Wen, and Weinan Zhang
  3. MARL Efficiency
    11th ICLR, 2023
    Order Matters: Agent-by-agent Policy Optimization
    Xihuai Wang, Zheng Tian, Ziyu Wan, Ying Wen, Jun Wang, and Weinan Zhang
  4. MARL Efficiency
    30th IJCAI, 2021
    Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts
    Weinan Zhang, Xihuai Wang, Jian Shen, and Ming Zhou