SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving

🤗 HF Dataset • 🤗 SWE-Lego-Qwen3-8B/32B • 🧑‍💻 Code • 📖 Paper

GitHub • Hugging Face Collection • Synced (机器之心) • Website



We present SWE-Lego, a supervised fine-tuning (SFT) recipe designed to achieve state-of-the-art performance in software engineering (SWE) issue resolving. SWE-Lego comprises three core building blocks:

  • the SWE-Lego dataset, a collection of 32k high-quality task instances and 18k validated trajectories, combining real and synthetic data so the two complement each other in both quality and quantity;
  • a refined SFT procedure with error masking and a difficulty-based curriculum, which demonstrably improves action quality and overall performance;
  • a well-trained verifier for improving test-time scaling (TTS).

Our fine-tuned policy models are trained exclusively with SFT from Qwen3-8B and Qwen3-32B, and their effectiveness is demonstrated on SWE-Bench-Verified.

For verifier-based TTS, we also release SWE-Lego-Verifier-8B and SWE-Lego-Verifier-30B-A3B, together with the verifier dataset SWE-Lego/SWE_Lego_real_data_Verifier.

We've open-sourced everything: our datasets, models, code, and training scripts, so that everyone can build on them to scale and improve software engineering agents.

Reproduction Guide 🎯

1. 📦 Installation

git clone https://github.com/SWE-Lego/SWE-Lego.git

1.1 Installing the vLLM environment

conda create -n vllm python=3.12 -y
conda activate vllm
pip install vllm

1.2 Installing the OpenHands environment

You can refer to the Development Guide from OpenHands.

cd SWE-Lego/OpenHands-0.53.0
conda create -n openhands python=3.12 -y
conda activate openhands
conda install -c conda-forge nodejs=24.4.1 
conda install -c conda-forge poetry=2.1.4
pip install python-dateutil==2.9.0.post0
poetry run pip install datasets
make build

1.3 Installing the SWE-bench environment

cd SWE-Lego/SWE-bench-4.0.4
conda create -n swebench python=3.12 -y
conda activate swebench
pip install -e .

1.4 Installing the LLaMA-Factory environment

cd SWE-Lego/LLaMA-Factory-0.9.4.dev0
conda create -n lf python=3.12 -y
conda activate lf

pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu128
pip install -e ".[torch,metrics,deepspeed,liger-kernel]" --no-build-isolation

# install flash-attn
wget https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.3/flash_attn-2.8.3+cu12torch2.8cxx11abiFALSE-cp312-cp312-linux_x86_64.whl
pip install flash_attn-2.8.3+cu12torch2.8cxx11abiFALSE-cp312-cp312-linux_x86_64.whl

pip install wandb

🤖 2. Inference and Evaluation of SWE-Lego-Qwen3-8B/32B

We take SWE-Lego-Qwen3-32B as an example.

2.1 Serving the model via vLLM

bash scripts/swe_lego_qwen3_32b/serve_vllm.sh

2.2 Running inference via OpenHands

bash scripts/swe_lego_qwen3_32b/infer.sh

2.3 Running evaluation via SWE-bench

bash scripts/swe_lego_qwen3_32b/eval.sh

🔥 3. Training of SWE-Lego-Qwen3-8B/32B

3.1 Downloading trajectories for SFT from Hugging Face

Save the downloaded trajectories to LLaMA-Factory-0.9.4.dev0/data

# Run this script from LLaMA-Factory-0.9.4.dev0/data so the JSON files
# land where the SFT configs expect them.
import json
from datasets import load_dataset

datasets = [
    {
        "name": "SWE-Lego/SWE-Lego-Real-Data",
        "filename": "swe_lego_real_data_resolved_trajectories.json"
    },
    {
        "name": "SWE-Lego/SWE-Lego-Synthetic-Data", 
        "filename": "swe_lego_synthetic_data_resolved_trajectories.json"
    }
]

for config in datasets:
    ds = load_dataset(config["name"], split="resolved")
    processed_ds = ds.select_columns(["instance_id", "messages"])
    data_list = processed_ds.to_list()
    
    with open(config["filename"], "w", encoding="utf-8") as f:
        json.dump(data_list, f, ensure_ascii=False, indent=4)
    print(f"Saved {len(data_list)} records to {config['filename']}")

3.2 Running SFT via LLaMA-Factory

bash scripts/swe_lego_qwen3_8b/sft.sh
bash scripts/swe_lego_qwen3_32b/sft.sh

🧪 4. Verifier Training and Inference

In our TTS setting, we find that generative verifier selection works better than regression-based alternatives; our verifier-enabled setup reaches 49.6% TTS@16 with SWE-Lego-Qwen3-8B and 58.8% TTS@16 with SWE-Lego-Qwen3-32B.

4.1 Verifier data and conversion

Verifier training data is released at SWE-Lego/SWE_Lego_real_data_Verifier. Each verifier sample is built from three parts: trajectory + patch + judgement. At inference time, only trajectory + patch are used as input; judgement is predicted by the verifier.
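To make the sample layout concrete, here is a minimal, hypothetical example; only the three-part structure (trajectory + patch + judgement) comes from the description above, and the field names are illustrative rather than the released dataset's exact schema:

```python
# Hypothetical verifier training sample; field names are illustrative.
verifier_sample = {
    "trajectory": [
        {"role": "assistant", "content": "Opened utils.py and located the bug."},
    ],
    "patch": "diff --git a/utils.py b/utils.py\n...",
    "judgement": "resolved",  # supervision target, present in training data only
}

# At inference time the verifier sees only trajectory + patch;
# the judgement is what it must predict.
verifier_input = {k: verifier_sample[k] for k in ("trajectory", "patch")}
```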

Use this script to convert raw rollout trajectories into verifier inference format:

python LLaMA-Factory-0.9.4.dev0/tts/convert_trajectories_to_verifier.py \
  --input /path/to/raw_trajectories.jsonl \
  --output /path/to/verifier_input.jsonl

The converter reads instance-level trajectories (e.g., run_1, run_2, ... with funccalloff_messages, patch, and resolved) and writes verifier-ready JSONL with system + user messages.
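A minimal sketch of what such a conversion can look like, assuming the per-run fields named above (funccalloff_messages, patch, resolved); the prompt wording and output layout here are illustrative, not the actual converter's:

```python
# Hypothetical system prompt; the real converter's wording may differ.
SYSTEM_PROMPT = "You are a verifier. Judge whether the patch resolves the issue."

def trajectory_to_verifier_sample(run):
    """Turn one rollout (funccalloff_messages + patch + resolved) into a
    verifier-ready record with system + user messages."""
    trajectory_text = "\n".join(m.get("content", "") for m in run["funccalloff_messages"])
    user_content = f"Trajectory:\n{trajectory_text}\n\nPatch:\n{run['patch']}"
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_content},
        ],
        # Ground-truth label kept for offline analysis only; at inference
        # time the judgement is predicted by the verifier.
        "resolved": run["resolved"],
    }
```

One such record per run (run_1, run_2, ...) would then be serialized as one JSONL line.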

4.2 Train verifier models

Verifier SFT configs:

  • LLaMA-Factory-0.9.4.dev0/examples/train_full/swe_lego_verifier_qwen3_8b.yaml
  • LLaMA-Factory-0.9.4.dev0/examples/train_full/swe_lego_verifier_qwen3_30b_a3b.yaml

Training launch scripts:

  • scripts/swe_lego_verifier_qwen3_8b/sft.sh
  • scripts/swe_lego_verifier_qwen3_30b_a3b/sft.sh

Run from repo root:

bash scripts/swe_lego_verifier_qwen3_8b/sft.sh
bash scripts/swe_lego_verifier_qwen3_30b_a3b/sft.sh

4.3 Run verifier inference

Inference launch scripts:

  • scripts/swe_lego_verifier_qwen3_8b/infer.sh
  • scripts/swe_lego_verifier_qwen3_30b_a3b/infer.sh

Example:

bash scripts/swe_lego_verifier_qwen3_8b/infer.sh /path/to/verifier_input.jsonl
bash scripts/swe_lego_verifier_qwen3_30b_a3b/infer.sh /path/to/verifier_input.jsonl

The inference output contains per-instance run scores (including predicted_score), which can be used for ranking rollouts in parallel TTS.
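For instance, selection by verifier score can be as simple as taking the argmax over an instance's runs; only the predicted_score field name comes from the inference output described above, and the surrounding structure is illustrative:

```python
def select_best_rollout(runs):
    """Return the (run_id, record) pair with the highest verifier score.

    `runs` maps run ids (e.g. "run_1") to per-run records carrying a
    `predicted_score` field, as in the verifier inference output.
    """
    return max(runs.items(), key=lambda kv: kv[1]["predicted_score"])

# Illustrative rollouts for one SWE-bench instance.
runs = {
    "run_1": {"predicted_score": 0.31, "patch": "diff A"},
    "run_2": {"predicted_score": 0.87, "patch": "diff B"},
}
best_id, best = select_best_rollout(runs)  # the patch of `best` is submitted
```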

Acknowledgements

This project builds on the valuable contributions of the following open-source projects: OpenHands, SWE-bench, LLaMA-Factory, and vLLM.

Citation 📝

Please cite our paper if you find the repo helpful in your work:

@misc{swelego,
      title={SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving}, 
      author={Chaofan Tao and Jierun Chen and Yuxin Jiang and Kaiqi Kou and Shaowei Wang and Ruoyu Wang and Xiaohui Li and Sidi Yang and Yiming Du and Jianbo Dai and Zhiming Mao and Xinyu Wang and Lifeng Shang and Haoli Bai},
      year={2026},
      eprint={2601.01426},
      archivePrefix={arXiv},
      primaryClass={cs.SE},
      url={https://arxiv.org/abs/2601.01426}, 
}
