Welcome to Awesome-Human-Agent-Collaboration-Interaction-Systems! 🚀 This is the repository accompanying our survey, LLM-Based Human-Agent Collaboration and Interaction Systems: A Survey.
Recent advances in large language models (LLMs) have sparked growing interest in building fully autonomous agents. However, fully autonomous LLM-based agents still face significant challenges: (1) limited reliability due to hallucinations, (2) difficulty handling complex tasks, and (3) substantial safety and ethical risks. These challenges limit their feasibility and trustworthiness in real-world applications.
LLM-based human-agent collaboration systems are interactive frameworks where humans actively provide (1) additional information, (2) feedback, or (3) control during interaction with LLM-powered agents to enhance system performance, reliability, and safety. These human-agent collaboration systems enable humans and LLM-based agents to collaborate effectively by leveraging their complementary strengths. For a detailed introduction, please refer to our survey paper: LLM-Based Human-Agent Collaboration and Interaction Systems: A Survey.
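To make this collaboration pattern concrete, here is a minimal, hypothetical Python sketch (all names are illustrative and not taken from the survey) in which an agent proposes actions while a human can inject additional information, give corrective feedback, or take direct control:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class HumanInput:
    """One round of (hypothetical) human intervention."""
    info: Optional[str] = None      # (1) additional information for the agent
    feedback: Optional[str] = None  # (2) feedback on the agent's proposal
    override: Optional[str] = None  # (3) direct control: replaces the action

def collaborate(agent_step: Callable[[str], str],
                get_human_input: Callable[[str], HumanInput],
                task: str, max_turns: int = 3) -> str:
    """Run an agent loop with a human in the loop.

    The agent proposes an action from the current context; the human may
    override it, enrich the context, or object via feedback. If the human
    raises no objection, the proposal is accepted.
    """
    context = task
    action = ""
    for _ in range(max_turns):
        action = agent_step(context)
        human = get_human_input(action)
        if human.override is not None:   # human takes control
            return human.override
        if human.info:                   # human adds information
            context += f"\n[info] {human.info}"
        if human.feedback:               # human gives feedback; agent retries
            context += f"\n[feedback on '{action}'] {human.feedback}"
        else:
            break                        # no objection: accept the action
    return action
```

The loop illustrates how the three intervention channels compose: control short-circuits the loop, while information and feedback steer the next proposal.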
Our goal with this project is to build an exhaustive collection of awesome resources relevant to LLM-based human-agent/AI collaboration and interaction systems, encompassing papers, repositories, and more, to foster further research and innovation in this rapidly evolving interdisciplinary field of human-AI collaboration. 🤗 Contributions are welcome! 🤗 If you have recommended papers, resources, or suggestions, please submit a pull request, open an issue, or contact us. We will keep updating our repository and survey paper.
(©️click here back to table of contents👆🏻)
- [1 Apr 2026] [arXiv 2026] When Users Change Their Mind: Evaluating Interruptible Agents in Long-Horizon Web Navigation
- [30 Mar 2026] [arXiv 2026] ViviDoc: Generating Interactive Documents through Human-Agent Collaboration
- [18 Feb 2026] [arXiv 2026] Learning Personalized Agents from Human Feedback
- [30 Nov 2025] [arXiv 2025] HAI-Eval: Measuring Human-AI Synergy in Collaborative Coding
- [4 Nov 2025] [arXiv 2025] Training Proactive and Personalized LLM Agents
- [15 Oct 2025] [arXiv 2025] Training LLM Agents to Empower Humans
- [10 Oct 2025] [arXiv 2025] How can we assess human-agent interactions? Case studies in software agent design
- [7 Oct 2025] [arXiv 2025] RECODE-H: A Benchmark for Research Code Development with Interactive Human Feedback
- [24 Sep 2025] [arXiv 2025] UserRL: Training Proactive User-Centric Agent via Reinforcement Learning
- [26 Aug 2025] [arXiv 2025] MUA-RL: Multi-turn User-interacting Agent Reinforcement Learning for agentic tool use
- [20 Aug 2025] [arXiv 2025] aiXiv: A Next-Generation Open Access Ecosystem for Scientific Discovery Generated by AI Scientists
- [31 Jul 2025] [arXiv 2025] MemoCue: Empowering LLM-Based Agents for Human Memory Recall via Strategy-Guided Querying
- [30 Jul 2025] [arXiv 2025] Magentic-UI: Towards Human-in-the-loop Agentic Systems
- [29 Jul 2025] [arXiv 2025] UserBench: An Interactive Gym Environment for User-Centric Agents
- [28 Jul 2025] [arXiv 2025] GenoMAS: A Multi-Agent Framework for Scientific Discovery via Code-Driven Gene Expression Analysis
- [23 Jul 2025] [arXiv 2025] Enabling Self-Improving Agents to Learn at Test Time With Human-In-The-Loop Guidance
- [21 Jul 2025] [arXiv 2025] Interaction as Intelligence: Deep Research With Human-AI Partnership
- [13 Jun 2025] [arXiv 2025] Interaction, Process, Infrastructure: A Unified Architecture for Human-Agent Collaboration
- [11 Jun 2025] [arXiv 2025] A Call for Collaborative Intelligence: Why Human-Agent Systems Should Precede AI Autonomy
- [9 Jun 2025] [arXiv 2025] τ2-Bench: Evaluating Conversational Agents in a Dual-Control Environment
- [6 Jun 2025] [arXiv 2025] Future of Work with AI Agents: Auditing Automation and Augmentation Potential across the U.S. Workforce
- [24 May 2025] [ICLR 2025] Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training [Code]
- [23 May 2025] [arXiv 2025] Collaborative Memory: Multi-User Memory Sharing in LLM Agents with Dynamic Access Control
- [21 May 2025] [arXiv 2025] Prototypical Human-AI Collaboration Behaviors from LLM-Assisted Writing in the Wild
- [16 May 2025] [arXiv 2025] XtraGPT: LLMs for Human-AI Collaboration on Controllable Academic Paper Revision
- [5 May 2025] [arXiv 2025] SymbioticRAG: Enhancing Document Intelligence Through Human-LLM Symbiotic Collaboration
- [1 May 2025] [arXiv 2025] LLM-Based Human-Agent Collaboration and Interaction Systems: A Survey
- [13 Apr 2025] [arXiv 2025] EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety
- [11 Apr 2025] [arXiv 2025] MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft
- [4 Apr 2025] [arXiv 2025] APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay
- [24 Mar 2025] [ACL 2025 Findings] SPHERE: An Evaluation Card for Human-AI Systems
- [19 Mar 2025] [arXiv 2025] SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks
- [10 Mar 2025] [arXiv 2025] Experimental Exploration: Investigating Cooperative Interaction Behavior Between Humans and Large Language Model Agents
- [4 Mar 2025] [arXiv 2025] FinArena: A Human-Agent Collaboration Framework for Financial Market Analysis and Forecasting
- [3 Mar 2025] [ICML 2025] M3HF: Multi-agent Reinforcement Learning from Multi-phase Human Feedback of Mixed Quality
- [27 Feb 2025] [ICLR 2025] ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments
- [17 Feb 2025] [ACL 2025] Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration
- [2 Feb 2025] [ICML 2025] CollabLLM: From Passive Responders to Active Collaborators
- [28 Jan 2025] [NAACL 2025 Demo] CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation
- [25 Dec 2024] [IROS 2024] To Help or Not to Help: LLM-based Attentive Support for Human-Robot Group Interactions
- [20 Dec 2024] [arXiv 2024] Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration
- [8 Dec 2024] [arXiv 2024] Towards Modeling Human-Agentic Collaborative Workflows: A BPMN Extension
- [26 Nov 2024] [arXiv 2024] Effect of Adaptive Communication Support on LLM-powered Human-Robot Collaboration
- [31 Oct 2024] [ICLR 2025] PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks
- [30 Oct 2024] [ICLR 2025] ACC-Collab: An Actor-Critic Approach to Multi-Agent LLM Collaboration
- [16 Oct 2024] [ICLR 2025] Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance
- [26 Sep 2024] [arXiv 2024] AssistantX: An LLM-Powered Proactive Assistant in Collaborative Human-Populated Environment
- [25 Sep 2024] [arXiv 2024] AXIS: Efficient Human-Agent-Computer Interaction with API-First LLM-Based Agents
- [13 Sep 2024] [arXiv 2024] Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task
- [27 Aug 2024] [EMNLP 2024] Into the Unknown Unknowns: Engaged Human Learning through Participation in Language Model Agent Conversations
- [12 Jul 2024] [SME 2024] Human-LLM collaboration in generative design for customization
- [20 Jun 2024] [RAL 2024] Enhancing the LLM-Based Robot Manipulation Through Human-Robot Collaboration
- [18 Jun 2024] [ICLR 2025] τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
- [17 Jun 2024] [EMNLP 2024] Ask-before-Plan: Proactive Language Agents for Real-World Planning
- [14 Jun 2024] [NeurIPS 2024] DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning
- [4 Jun 2024] [CASE 2024] Enhancing Human-Robot Collaborative Assembly in Manufacturing Systems Using Large Language Models
- [30 May 2024] [arXiv 2024] Safe Multi-agent Reinforcement Learning with Natural Language Constraints
- [27 May 2024] [AAAI 2025] REVECA: Adaptive Planning and Trajectory-based Validation in Cooperative Language Agents using Information Relevance and Relative Proximity
- [23 Apr 2024] [NeurIPS 2024] Aligning LLM Agents by Learning Latent Preference from User Edits
- [18 Apr 2024] [arXiv 2024] AgentCoord: Visually Exploring Coordination Strategy for LLM-based Multi-Agent Collaboration
- [5 Apr 2024] [IUI 2024] PDFChatAnnotator: A Human-LLM Collaborative Multi-Modal Data Annotation Tool for PDF-Format Catalogs
- [19 Mar 2024] [arXiv 2024] Embodied LLM Agents Learn to Cooperate in Organized Teams
- [8 Feb 2024] [arXiv 2024] WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
- [7 Feb 2024] [NeurIPS 2024] Can Large Language Model Agents Simulate Human Trust Behavior?
- [25 Jan 2024] [arXiv 2024] A2C: A Modular Multi-stage Collaborative Decision Framework for Human-AI Teams
- [23 Dec 2023] [AAMAS 2024] LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination
- [18 Oct 2023] [ICLR 2024] SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents
- [19 Sep 2023] [WACV 2024] Drive as You Speak: Enabling Human-Like Interaction with Large Language Models in Autonomous Vehicles
- [19 Sep 2023] [ICLR 2024] MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback
- [18 Sep 2023] [NAACL 2024] MindAgent: Emergent Gaming Interaction
- [1 Aug 2023] [ICML 2023] MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework
- [5 Jul 2023] [ICLR 2024] Building Cooperative Embodied Agents Modularly with Large Language Models
- [4 Jul 2023] [ICML 2023] Embodied Task Planning with Large Language Models
- [1 Jun 2023] [IEEE 2023] Improved Trust in Human-Robot Collaboration With ChatGPT
- [22 May 2023] [EACL 2024] Investigating Agency of LLMs in Human-AI Collaboration Tasks
- [21 Apr 2023] [EACL 2024] Improving Grounded Language Understanding in a Collaborative Environment by Interacting with Agents Through Help Feedback
(©️click here back to table of contents👆🏻)
- [1 Apr 2026] [arXiv 2026] When Users Change Their Mind: Evaluating Interruptible Agents in Long-Horizon Web Navigation
- [7 Oct 2025] [arXiv 2025] RECODE-H: A Benchmark for Research Code Development with Interactive Human Feedback
- [30 Jul 2025] [arXiv 2025] Magentic-UI: Towards Human-in-the-loop Agentic Systems
- [19 Mar 2025] [arXiv 2025] SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks
- [27 Feb 2025] [ICLR 2025] ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments
- [19 Sep 2023] [ICLR 2024] MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback
- [26 Jun 2023] [NeurIPS 2023] InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback
- [31 Oct 2024] [ICLR 2025] PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks
- [19 Sep 2023] [ICLR 2024] MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback
- [5 Jul 2023] [ICLR 2024] Building Cooperative Embodied Agents Modularly with Large Language Models
- [4 Jul 2023] [arXiv 2023] Embodied Task Planning with Large Language Models
- [21 Apr 2023] [EACL 2024 Findings] Improving Grounded Language Understanding in a Collaborative Environment by Interacting with Agents Through Help Feedback
- [24 Sep 2025] [arXiv 2025] UserRL: Training Proactive User-Centric Agent via Reinforcement Learning
- [27 Aug 2024] [EMNLP 2024] Into the Unknown Unknowns: Engaged Human Learning through Participation in Language Model Agent Conversations
- [24 May 2025] [ICLR 2025] Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training [Code]
- [8 Feb 2024] [arXiv 2024] WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
- [19 Sep 2023] [ICLR 2024] MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback
- [11 Apr 2025] [arXiv 2025] MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft
- [18 Sep 2023] [ICLR 2024] MindAgent: Emergent Gaming Interaction
- [4 Mar 2025] [arXiv 2025] FinArena: A Human-Agent Collaboration Framework for Financial Market Analysis and Forecasting [Data Link]
- [28 Jul 2025] [arXiv 2025] GenoMAS: A Multi-Agent Framework for Scientific Discovery via Code-Driven Gene Expression Analysis
- [13 Apr 2025] [arXiv 2025] EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety
- [9 Jun 2025] [arXiv 2025] τ2-Bench: Evaluating Conversational Agents in a Dual-Control Environment
- [18 Jun 2024] [ICLR 2025] τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
- [29 Jul 2025] [arXiv 2025] UserBench: An Interactive Gym Environment for User-Centric Agents
- [9 Jun 2025] [arXiv 2025] τ2-Bench: Evaluating Conversational Agents in a Dual-Control Environment
- [18 Jun 2024] [ICLR 2025] τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
- [21 May 2025] [arXiv 2025] Prototypical Human-AI Collaboration Behaviors from LLM-Assisted Writing in the Wild
- [16 May 2025] [arXiv 2025] XtraGPT: LLMs for Human-AI Collaboration on Controllable Academic Paper Revision
(©️click here back to table of contents👆🏻)
For a detailed introduction of the taxonomy, please refer to Section 3 in our survey paper: LLM-Based Human-Agent Collaboration and Interaction Systems: A Survey.
(©️click here back to table of contents👆🏻)
Human feedback can take different forms, arrive at different granularities, and occur during different phases of a task. In the following table, we summarize these dimensions of human feedback in LLM-based human-agent systems: feedback type, granularity, and phase. For each dimension, we provide a summary, key characteristics, and example works for comparison. More details are in Section 3.2 of our survey paper.
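As an illustration only, the three dimensions can be encoded as a small Python data model. The member names below are our own shorthand, not the survey's exact category labels:

```python
from dataclasses import dataclass
from enum import Enum

# Illustrative (not the survey's exact taxonomy labels): the three
# dimensions along which human feedback varies in human-agent systems.

class FeedbackType(Enum):
    NATURAL_LANGUAGE = "free-form natural language comments"
    PREFERENCE = "choosing among candidate outputs"
    EDIT = "direct edits to the agent's output"

class Granularity(Enum):
    TURN_LEVEL = "a single dialogue turn or action"
    TASK_LEVEL = "the overall task outcome"

class Phase(Enum):
    PRE_EXECUTION = "before the agent acts (e.g., plan approval)"
    IN_EXECUTION = "while the agent acts (e.g., interruption)"
    POST_EXECUTION = "after the agent acts (e.g., output rating)"

@dataclass(frozen=True)
class Feedback:
    """One piece of human feedback, located in the three-dimensional taxonomy."""
    type: FeedbackType
    granularity: Granularity
    phase: Phase
    content: str

# Example: a turn-level edit given after the agent produced its output.
fb = Feedback(FeedbackType.EDIT, Granularity.TURN_LEVEL,
              Phase.POST_EXECUTION, "fix the second bullet")
```

Any concrete system in the table can be placed in this space by asking which type of feedback it collects, at what granularity, and in which phase.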
(©️click here back to table of contents👆🏻)
Contributions are welcome! If you have relevant papers, code, or insights, feel free to submit a pull request 🤗.
(©️click here back to table of contents👆🏻)
If you find this repository useful, please consider citing our papers 💕:
```bibtex
@misc{zou2025llmbasedhumanagentcollaborationinteraction,
  title={LLM-Based Human-Agent Collaboration and Interaction Systems: A Survey},
  author={Henry Peng Zou and Wei-Chieh Huang and Yaozu Wu and Yankai Chen and Chunyu Miao and Hoang Nguyen and Yue Zhou and Weizhi Zhang and Liancheng Fang and Langzhou He and Yangning Li and Dongyuan Li and Renhe Jiang and Xue Liu and Philip S. Yu},
  year={2025},
  eprint={2505.00753},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.00753},
}

@misc{zou2025collaborativeintelligencehumanagentsystems,
  title={A Call for Collaborative Intelligence: Why Human-Agent Systems Should Precede AI Autonomy},
  author={Henry Peng Zou and Wei-Chieh Huang and Yaozu Wu and Chunyu Miao and Dongyuan Li and Aiwei Liu and Yue Zhou and Yankai Chen and Weizhi Zhang and Yangning Li and Liancheng Fang and Renhe Jiang and Philip S. Yu},
  year={2025},
  eprint={2506.09420},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2506.09420},
}
```



