Publications

A record of collaborative work.

Peer-reviewed papers and preprints, organized by year. * denotes equal contribution.

Full list also on Google Scholar.

2026 (1 paper)

  1. LLM Reasoning & Agency | 14th ICLR, 2026

    SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning

    Yuqian Fu, Tinghong Chen, Jiajun Chai, Xihuai Wang, Songjun Tu, Guojun Yin, Wei Lin, Qichao Zhang, Yuanheng Zhu, and Dongbin Zhao

2025 (2 papers)

  1. LLM Reasoning & Agency | 63rd ACL, 2025

    Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration

    Shao Zhang*, Xihuai Wang*, Wenhao Zhang, Chaoran Li, Junru Song, Tingyu Li, Lin Qiu, Xuezhi Cao, Xunliang Cai, Wen Yao, Weinan Zhang, Xinbing Wang, and Ying Wen

  2. MARL Efficiency | 24th AAMAS, 2025

    PMAT: Optimizing Action Generation Order in Multi-Agent Reinforcement Learning

    Kun Hu, Muning Wen, Xihuai Wang, Shao Zhang, Yiwei Shi, Minne Li, Minglong Li, and Ying Wen

2024 (2 papers)

  1. LLM Reasoning & Agency | Preprint Under Review, 2024

    Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task

    Shao Zhang*, Xihuai Wang*, Wenhao Zhang, Yongshan Chen, Landi Gao, Dakuo Wang, Weinan Zhang, Xinbing Wang, and Ying Wen

  2. MARL Generalization | 38th NeurIPS, 2024

    ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination

    Xihuai Wang, Shao Zhang, Wenhao Zhang, Wentao Dong, Jingxiao Chen, Ying Wen, and Weinan Zhang

2023 (1 paper)

  1. MARL Efficiency | 11th ICLR, 2023

    Order Matters: Agent-by-agent Policy Optimization

    Xihuai Wang, Zheng Tian, Ziyu Wan, Ying Wen, Jun Wang, and Weinan Zhang

2022 (1 paper)

  1. MARL Efficiency | arXiv preprint arXiv:2203.10603, 2022

    Model-based Multi-agent Reinforcement Learning: Recent Progress and Prospects

    Xihuai Wang, Zhicheng Zhang, and Weinan Zhang

2021 (1 paper)

  1. MARL Efficiency | 30th IJCAI, 2021

    Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts

    Weinan Zhang, Xihuai Wang, Jian Shen, and Ming Zhou