Xihuai Wang (王锡淮)

Ph.D. @ APEX Lab, Shanghai Jiao Tong University.


Shanghai, China

I received my Ph.D. from the APEX Lab at Shanghai Jiao Tong University in 2026, supervised by Prof. Weinan Zhang and Prof. Ying Wen. I was affiliated with both the SJTU-Apex Group and the SJTU-MARL Group, and I was honored to be selected into the Wen-Tsun Wu AI Honorary Doctoral Program in 2020. Before that, I received my B.Eng. in Computer Science and Technology from the School of Computer Science and Engineering, Sun Yat-sen University, in 2020.

My research lies at the intersection of Reinforcement Learning and Multi-Agent Learning. My current research focuses on:

  • Large Language Model Reasoning and Agency
    • Reinforcement learning methodologies for enhancing reasoning and agentic capabilities of LLMs
    • Human-AI collaborative decision-making with LLM-based agents
  • Multi-Agent Reinforcement Learning
    • Sample-efficient algorithms for cooperative MARL
    • Zero-shot coordination and generalization in multi-agent systems
🙋
I am on the 2025-2026 job market, seeking research and development positions in LLM reasoning and agent systems. Please feel free to reach out via leoxhwang@sjtu.edu.cn about potential opportunities.

News

Jan 13, 2026 I have successfully defended my Ph.D. thesis and graduated from Shanghai Jiao Tong University! 🎉
Dec 2, 2025 A blog post sharing my perspective on KL estimators in reinforcement learning.
Nov 23, 2025 A blog post sharing my perspective on training–inference mismatch in reinforcement learning for large language models.
May 16, 2025 Our paper Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration has been accepted to ACL 2025!
Sep 26, 2024 Our work on zero-shot coordination evaluation, ZSC-Eval, has been accepted to the NeurIPS 2024 Datasets and Benchmarks Track!

Selected Papers

A full list of publications is available on Google Scholar or the Publications page (* denotes equal contribution).
Papers are organized by topic:
  1. Large Language Model Reasoning and Agency
  2. Multi-Agent Reinforcement Learning
  1. Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration
    Shao Zhang*, Xihuai Wang*, Wenhao Zhang, Chaoran Li, Junru Song, Tingyu Li, Lin Qiu, Xuezhi Cao, Xunliang Cai, Wen Yao, Weinan Zhang, Xinbing Wang, and Ying Wen
    63rd ACL, 2025
    LLM Reasoning and Agency
  2. ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination
    Xihuai Wang, Shao Zhang, Wenhao Zhang, Wentao Dong, Jingxiao Chen, Ying Wen, and Weinan Zhang
    38th NeurIPS, 2024
    MARL Generalization
  3. Order Matters: Agent-by-agent Policy Optimization
    Xihuai Wang, Zheng Tian, Ziyu Wan, Ying Wen, Jun Wang, and Weinan Zhang
    11th ICLR, 2023
    MARL Efficiency
  4. Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts
    Weinan Zhang, Xihuai Wang, Jian Shen, and Ming Zhou
    30th IJCAI, 2021
    MARL Efficiency