Jae-Woo Choi1, 2, Hyungmin Kim 2, Hyobin Ong2, Youngwoo Yoon1, 2, Minsu Jang1, 2, Dohyung Kim1, 2, Jaehong Kim1
1 Electronics and Telecommunications Research Institute, 2 University of Science and Technology
Here is the official implementation of ReAcTree, a hierarchical task-planning framework designed to solve complex, long-horizon tasks. Instead of relying on a single, monolithic trajectory, ReAcTree dynamically constructs an agent tree by decomposing a complex goal into manageable subgoals. This tree is composed of LLM agent nodes—which can reason, act, and further expand the tree —and control flow nodes (e.g., sequence, fallback, parallel) that coordinate the execution strategy. The framework is supported by two complementary memory systems: episodic memory to retrieve relevant subgoal-level examples for in-context learning and working memory to share environmental observations (like object locations) across all nodes. ReAcTree was validated on the WAH-NL and ALFRED benchmarks , where it consistently and significantly outperformed strong baselines like ReAct.
Ubuntu 14.04+ is required. The scripts were developed and tested on Ubuntu 22.04 and Python 3.8.
You can use WSL-Ubuntu on Windows 10/11.
-
Clone the whole repo.
$ git clone {repo_url} -
Setup a virtual environment.
$ conda create -n {env_name} python=3.8 $ conda activate {env_name} -
Install PyTorch (2.3.1) first (see https://pytorch.org/get-started/locally/).
# exemplary install command for PyTorch 2.3.1 with CUDA 11.8 $ pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu118 -
Install python packages in
requirements.txt.$ pip install -r requirements.txt
- Prepare VirtualHome simulator v.2.2.2
$ cd {project_root}/virtualhome/simulation/unity_simulator/ $ wget http://virtual-home.org//release/simulator/v2.0/v2.2.2/linux_exec.zip $ unzip linux_exec.zip - Prepare ALFRED dataset
$ cd {project_root}/alfred/data $ sh download_data.sh json
- Open a new terminal and run VirtualHome simulator
$ cd {project_root}
$ ./virtualhome/simulation/unity_simulator/linux_exec.x86_64- Open a new terminal and evaluate
$ cd {project_root}
$ python src/evaluate.py --config-name=wah_reactree exp_type=evaluate llm_agent.model_name=meta-llama/Meta-Llama-3.1-8B llm_agent.working_memory=True prompt.sys_prompt_root_dir=resource/wah/sys_prompt prompt.ic_ex_root_dir=resource/wah/em_llm llm_agent.ic_ex_select_type=rag llm_agent.max_steps=199 llm_agent.max_decisions=199 - Open a new terminal and run Xserver
# requires sudo permission
$ cd {project_root}/virtualhome
$ sudo python3 helper_scripts/startx.py $display_num- Open another terminal and run unity simulator
$ cd {project_root}/virtualhome
$ DISPLAY=:$display_num ./simulation/unity_simulator/linux_exec.x86_64 -batchmode- Open another terminal and evaluate
$ python src/evaluate.py --config-name=wah_headless_reactree exp_type=evaluate llm_agent.model_name=meta-llama/Meta-Llama-3.1-8B llm_agent.working_memory=True prompt.sys_prompt_root_dir=resource/wah/sys_prompt prompt.ic_ex_root_dir=resource/wah/em_llm llm_agent.ic_ex_select_type=rag llm_agent.max_steps=199 llm_agent.max_decisions=199 - Open a new terminal and collect human trajectories
$ cd {project_root}
$ python src/collect_human.py --config-name=wah_reactree llm_agent.working_memory=True dataset.collect_ex_root_dir=resource/wah- After collecting human trajectories, embed the trajectories
$ cd {project_root}
$ python src/embed_em.py --config-name=wah_reactree exp_type=embed_em llm_agent.working_memory=True dataset.embedding_root_dir='resource/wah/collect_human' dataset.em_root_dir='resource/wah/em_human'- Open a new terminal and collect llm trajectories
$ cd {project_root}
$ python src/collect_llm.py --config-name=wah_reactree exp_type=collect_llm llm_agent.model_name=meta-llama/Meta-Llama-3.1-70B llm_agent.working_memory=True dataset.collect_ex_root_dir=resource/wah prompt.sys_prompt_root_dir=resource/wah/sys_prompt prompt.ic_ex_root_dir=resource/wah/em_human llm_agent.ic_ex_select_type=rag llm_agent.max_steps=199 llm_agent.max_decisions=199 - After bootstrapping, embed the trajectories
$ cd {project_root}
$ python src/embed_em.py --config-name=wah_reactree exp_type=embed_em llm_agent.working_memory=True dataset.embedding_root_dir=resource/wah/collect_llm dataset.em_root_dir=resource/wah/em_llm- Open a new terminal and evaluate
$ cd {project_root}
$ python src/evaluate.py --config-name=alfred_reactree exp_type=evaluate dataset.eval_set=valid_seen llm_agent.model_name=meta-llama/Meta-Llama-3.1-8B llm_agent.working_memory=True prompt.sys_prompt_root_dir=resource/alfred/sys_prompt prompt.ic_ex_root_dir=resource/alfred/em_llm llm_agent.ic_ex_select_type=rag llm_agent.max_steps=100 llm_agent.max_decisions=100- Please run
startx.pyscript before running ALFRED experiment on headless servers. Below script uses 1 for the X_DISPLAY id, but you can use different ids such as 0.
# requires sudo permission
$ sudo python3 alfred/scripts/startx.py 1- Open a new terminal and collect human trajectories
$ cd {project_root}
$ python src/collect_human.py --config-name=alfred_reactree llm_agent.working_memory=True dataset.collect_ex_root_dir=resource/alfred- After collecting human trajectories, embed the trajectories
$ cd {project_root}
$ python src/embed_em.py --config-name=alfred_reactree llm_agent.working_memory=True dataset.embedding_root_dir='resource/alfred/collect_human' dataset.em_root_dir='resource/alfred/em_human'- Open a new terminal and collect llm trajectories
$ cd {project_root}
$ python src/collect_llm.py --config-name=alfred_reactree exp_type=collect_llm llm_agent.model_name=meta-llama/Meta-Llama-3.1-8B llm_agent.working_memory=True dataset.collect_ex_root_dir=resource/alfred prompt.sys_prompt_root_dir=resource/alfred/sys_prompt prompt.ic_ex_root_dir=resource/alfred/em_human llm_agent.ic_ex_select_type=rag llm_agent.max_decisions=100 alfred.diverse_task=True - After bootstrapping, embed the trajectories
$ cd {project_root}
$ python src/embed_em.py --config-name=alfred_reactree dataset.check_success=True llm_agent.working_memory=True dataset.embedding_root_dir='resource/alfred/collect_llm' dataset.em_root_dir='resource/alfred/em_llm'-
Running out of disk space for Huggingface models
- You can set the cache folder to be in another disk.
$ export TRANSFORMERS_CACHE=/mnt/otherdisk/.hf_cache/
- You can set the cache folder to be in another disk.
-
I have encountered 'cannot find X server with xdpyinfo' in running ALFRED experiments.
- Please try another x_display number (this should be a string; e.g., '1') in the config file.
$ python src/evaluate.py --config-name=config_alfred alfred.x_display='1'
- Please try another x_display number (this should be a string; e.g., '1') in the config file.
-
I am encountering a Huggingface authentication error during evaluation.
- You need to log in using the huggingface-cli first. Open a new terminal, log in, and then run your evaluation.
$ huggingface-cli login $ cd {project_root} $ python src/evaluate.py --config-name=wah_reactree exp_type=evaluate llm_agent.model_name=meta-llama/Meta-Llama-3.1-8B llm_agent.working_memory=True prompt.sys_prompt_root_dir=resource/wah/sys_prompt prompt.ic_ex_root_dir=resource/wah/em_llm llm_agent.ic_ex_select_type=rag llm_agent.max_steps=199 llm_agent.max_decisions=199
- You need to log in using the huggingface-cli first. Open a new terminal, log in, and then run your evaluation.
@misc{choi2025reactreehierarchicalllmagent,
title={ReAcTree: Hierarchical LLM Agent Trees with Control Flow for Long-Horizon Task Planning},
author={Jae-Woo Choi and Hyungmin Kim and Hyobin Ong and Minsu Jang and Dohyung Kim and Jaehong Kim and Youngwoo Yoon},
year={2025},
eprint={2511.02424},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2511.02424},
}

