Xihuai Wang (王锡淮)

Ph.D. @ APEX Lab, Shanghai Jiao Tong University.

Shanghai, China

I received my Ph.D. from the APEX Lab at Shanghai Jiao Tong University in 2026, under the supervision of Prof. Weinan Zhang and Prof. Ying Wen, affiliated with both the SJTU-Apex Group and the SJTU-MARL Group. I was honored to be selected for the Wen-Tsun Wu AI Honorary Doctoral Program in 2020. Before that, I received my B.Eng. in Computer Science and Technology from the School of Computer Science and Engineering, Sun Yat-sen University, in 2020.

My research lies at the intersection of Reinforcement Learning and Multi-Agent Learning. My current research focuses on:

  • Large Language Model Reasoning and Agency
    • Reinforcement learning methodologies for enhancing reasoning and agentic capabilities of LLMs
    • Human-AI collaborative decision-making with LLM-based agents
  • Multi-Agent Reinforcement Learning
    • Sample-efficient algorithms for cooperative MARL
    • Zero-shot coordination and generalization in multi-agent systems
Jan 13, 2026 Ph.D. Graduation: I have successfully defended my Ph.D. thesis and graduated from Shanghai Jiao Tong University! 🎉
Dec 2, 2025 A blog post sharing my perspective on KL estimators in reinforcement learning.
Nov 23, 2025 A blog post sharing my perspective on training–inference mismatch in reinforcement learning for large language models.
May 16, 2025 Our paper Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration has been accepted to ACL 2025!
Sep 26, 2024 Our work on zero-shot coordination evaluation, ZSC-Eval, has been accepted to the NeurIPS 2024 Datasets and Benchmarks Track!
Aug 8, 2023 Gave a talk on cooperative multi-agent reinforcement learning (Coordinate Agents via Policy Optimization) at RLChina.
Mar 25, 2023 Our work on policy optimization in cooperative multi-agent scenarios, Order Matters: Agent-by-agent Policy Optimization, has been accepted to ICLR 2023!
Full publications are available on Google Scholar or Publications (* denotes equal contribution). Works are organized by topic: LLM Reasoning and Agency, Multi-agent Reinforcement Learning.
  1. LLM Reasoning & Agency | 63rd ACL, 2025

    Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration

    Shao Zhang*, Xihuai Wang*, Wenhao Zhang, Chaoran Li, Junru Song, Tingyu Li, Lin Qiu, Xuezhi Cao, Xunliang Cai, Wen Yao, Weinan Zhang, Xinbing Wang, and Ying Wen

  2. MARL Generalization | 38th NeurIPS, 2024

    ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination

    Xihuai Wang, Shao Zhang, Wenhao Zhang, Wentao Dong, Jingxiao Chen, Ying Wen, and Weinan Zhang

  3. MARL Efficiency | 11th ICLR, 2023

    Order Matters: Agent-by-agent Policy Optimization

    Xihuai Wang, Zheng Tian, Ziyu Wan, Ying Wen, Jun Wang, and Weinan Zhang

  4. MARL Efficiency | 30th IJCAI, 2021

    Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts

    Weinan Zhang, Xihuai Wang, Jian Shen, and Ming Zhou