MemGovern enhances SWE-Agent by injecting governance-aware experience memories into the agent's reasoning loop. When facing a new GitHub issue, the agent retrieves similar past experiences and learns from successful resolution patterns.
🐛 New Issue → 🔍 Memory Retrieval → 📝 Experience Injection → 🧠 Enhanced Reasoning → ✅ Better Patches
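Conceptually, the loop looks like the sketch below. Function and client names here are hypothetical illustrations; the real implementation lives under `tools/`. The `bug_description` / `fix_experience` fields match the experience card schema described later in this README.

```python
# Minimal sketch of the MemGovern loop (hypothetical names; the real
# implementation lives in tools/). Retrieval queries the Experience Server,
# and the retrieved cards are injected into the agent's context.

def resolve_issue(issue_text: str, agent, memory_client, top_k: int = 3) -> str:
    # 1. Memory retrieval: semantically similar past experiences
    cards = memory_client.search(query=issue_text, top_k=top_k)

    # 2. Experience injection: fold the retrieved cards into the prompt
    experience_block = "\n\n".join(
        f"Past bug: {c['bug_description']}\nFix experience: {c['fix_experience']}"
        for c in cards
    )

    # 3. Enhanced reasoning: the agent sees prior fixes plus the new issue
    return agent.run(prompt=f"{experience_block}\n\nNew issue:\n{issue_text}")
```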
| Model | SWE-Agent | MemGovern (Ours) | Improvement (pp) |
|---|---|---|---|
| Claude-4-Sonnet | 66.6% | 69.8% | +3.2 |
| GPT5-Medium | 65.0% | 67.4% | +2.4 |
| DeepSeek-V3.1T | 62.8% | 65.8% | +3.0 |
| Qwen3-235B | 47.2% | 55.4% | +8.2 |
| Kimi-K2-Instruct | 43.8% | 51.8% | +8.0 |
| GPT-4o | 23.2% | 32.6% | +9.4 |
| GPT-4o-Mini | 14.0% | 17.2% | +3.2 |
From "Solving from Scratch" β To "Learning from Experience"
MemGovern/
├── data/                 # 📦 Experience DB artifacts (Git LFS)
│   └── agentic_exp_data_1220_13w_DSnewPrompt/
│       ├── experience_data.json
│       └── chroma_db_experience/
├── trajectories/         # 🗃️ Model trajectory archives (Git LFS)
│   ├── gpt4o_*.tar.gz
│   ├── gemini3_pro_trajectory.tar.gz
│   └── ...
├── config/               # ⚙️ SWE-Agent compatible YAML configs
│   ├── benchmarks/       # Benchmark sweep configurations
│   ├── demo/             # Lightweight demo presets
│   ├── human/            # Human study protocols
│   └── exotic/           # Ablation experiment settings
├── tools/                # 🔧 Memory pipeline utilities
│   ├── experience_server.py
│   ├── issue_memory_rag/
│   ├── exp_search/
│   └── ...
├── scripts/              # 📜 Data collection scripts
│   ├── github_scraper.py
│   └── experience_process.py
├── figs/                 # 🖼️ Publication-ready figures
└── requirements.txt      # 📦 Runtime deps (installs SWE-agent + utilities)
MemGovern is implemented as memory tools + configs on top of SWE-agent. A full run uses two terminals:
- Terminal A: start the Experience Server (vector search + experience lookup)
- Terminal B: run SWE-agent on SWE-bench with a MemGovern config that calls the server tools
Requirements: Linux (or WSL2), Python β₯ 3.11, Git, Docker.
WSL2 note: Windows drives are mounted under /mnt/ (e.g., E:\ → /mnt/e/).
git clone https://github.com/QuantaAlpha/MemGovern.git
cd MemGovern
python3 -m venv SWE
source SWE/bin/activate
pip install -U pip
pip install -r requirements.txt

The Experience Server needs two artifacts:
- `experience_data.json`: governed experience cards (key → structured fields, including `bug_description` / `fix_experience`)
- `chroma_db_experience/`: a persistent ChromaDB store used for semantic retrieval
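For orientation, a single experience card might look like the following. This is illustrative only: besides `bug_description` / `fix_experience`, any other field names are assumptions, so check `experience_data.json` for the exact schema.

```python
# Illustrative experience card (assumed shape; experience_data.json maps
# an experience key to a structured record like this).
example_card = {
    "bug_description": "TypeError raised when parsing empty config files",
    "fix_experience": "Guard the parser entry point with an empty-input check "
                      "and add a regression test for zero-byte files",
}
```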
In this repository, we provide them under `data/agentic_exp_data_1220_13w_DSnewPrompt/` (tracked via Git LFS).
Place them in a directory (example layout):
<EXPERIENCE_DATA_DIR>/
├── experience_data.json
└── chroma_db_experience/
    ├── chroma.sqlite3
    └── <uuid>/
        ├── data_level0.bin
        ├── header.bin
        ├── index_metadata.pickle
        ├── length.bin
        └── link_lists.bin
Notes:
- These artifacts are large; we recommend hosting them via Git LFS or a separate dataset release.
- Retrieval quality depends on using the same embedding model at serving time as was used to build the ChromaDB store.
- In our internal runs, we keep these files under a folder named `agentic_exp_data_1220_13w_DSnewPrompt/`.
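To sanity-check the artifacts before serving, you can open the store directly. A minimal sketch, assuming the store was built with a sentence-transformers model; the collection name `experience` is an assumption, so inspect `client.list_collections()` (or `tools/experience_server.py`) for the real one:

```python
import chromadb
from sentence_transformers import SentenceTransformer

# Open the persistent store shipped under <EXPERIENCE_DATA_DIR>.
client = chromadb.PersistentClient(path="chroma_db_experience")
collection = client.get_collection("experience")  # assumed collection name

# Must be the SAME embedding model used to build the store (see MODEL_PATH).
model = SentenceTransformer("<PATH_OR_MODEL_ID_FOR_SENTENCE_TRANSFORMERS>")
query_vec = model.encode(["TypeError when parsing empty config"]).tolist()

results = collection.query(query_embeddings=query_vec, n_results=3)
print(results["ids"], results["distances"])
```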
cd <MEMGOVERN_ROOT>/data/agentic_exp_data_1220_13w_DSnewPrompt
source <MEMGOVERN_ROOT>/SWE/bin/activate
export DB_DIR="$PWD/chroma_db_experience"
export JSON_DATA_PATH="$PWD/experience_data.json"
export MODEL_PATH="<PATH_OR_MODEL_ID_FOR_SENTENCE_TRANSFORMERS>"
export HOST="0.0.0.0"
export PORT="9030"
python <MEMGOVERN_ROOT>/tools/experience_server.py

How to confirm it is running:
In another shell:
curl -s http://localhost:9030/health

You should also see log lines like:

[TOOL] /search ...
[TOOL] /get_experience ...

when the agent uses the tools (this is the run-through evidence we use).
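You can also exercise the tool endpoints directly from Python. A sketch, assuming `/search` is a POST taking a simple JSON body; the exact method and payload schema are defined in `tools/experience_server.py`:

```python
import requests

BASE = "http://localhost:9030"

# Liveness check, mirroring the curl command above.
print(requests.get(f"{BASE}/health", timeout=5).status_code)

# Assumed request shape; check tools/experience_server.py for the actual
# schema of /search (and /get_experience).
hits = requests.post(
    f"{BASE}/search",
    json={"query": "TypeError in config parser", "top_k": 3},
    timeout=30,
)
print(hits.json())
```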
Before running, edit config/dsv31t_agenticMemSearch_1220_13w.yaml and replace:
- `agent.model.api_base: YOUR_API_BASE`
- `agent.model.api_key: YOUR_API_KEY`
cd <MEMGOVERN_ROOT>
source SWE/bin/activate
sweagent run-batch \
--config config/dsv31t_agenticMemSearch_1220_13w.yaml \
--instances.type swe_bench \
--instances.subset verified \
--instances.split test \
--num_workers 12 \
--instances.shuffle=False

About the config ↔ server wiring:
config/dsv31t_agenticMemSearch_1220_13w.yaml sets tool endpoints:
- `GRAPH_EXP_SEARCH_URL`: `http://host.docker.internal:9030/search`
- `GRAPH_EXP_READ_URL`: `http://host.docker.internal:9030/get_experience`

This is the recommended setup when SWE-agent runs tasks inside Docker and the Experience Server runs on the host. On native Linux, `host.docker.internal` may not resolve by default; passing `--add-host=host.docker.internal:host-gateway` to the container (Docker 20.10+) is one common fix.
After the run finishes, evaluate the produced predictions:
python -m swebench.harness.run_evaluation \
--predictions_path <PATH_TO_PREDS_JSON> \
--dataset_name princeton-nlp/SWE-bench_Verified \
--run_id <RUN_ID> \
--max_workers 8

The predictions file is typically named `preds.json` under your run's `trajectories/` output directory. If `python -m swebench ...` is not available in your environment, install the SWE-bench harness following the official SWE-bench instructions.
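For reference, the SWE-bench harness consumes a JSON list (or JSONL) of prediction records, each carrying the instance ID, a run label, and the generated patch. An illustrative entry:

```python
# One prediction record in the format the SWE-bench harness expects.
prediction = {
    "instance_id": "astropy__astropy-12907",       # a SWE-bench Verified instance
    "model_name_or_path": "memgovern-dsv31t",      # hypothetical run label
    "model_patch": "diff --git a/... b/...\n...",  # the generated git diff
}
```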
Scrape GitHub PR data (metadata + patch + comments):
export GITHUB_TOKEN=your_github_token
python scripts/github_scraper.py \
--csv-path <PATH_TO_INPUT_CSV> \
--output-dir <OUTPUT_DIR> \
--chunk-size 200

We provide `experience_process.py` to transform issue/PR/patch fields into governed experience cards using an LLM.
It reads an input parquet table and writes JSONL/parquet with the Experience Card fields.
export API_KEY=your_llm_key
export BASE_URL=your_llm_base_url # optional if using OpenAI default
export MODEL=your_model_name
python scripts/experience_process.py \
--input <INPUT_PARQUET> \
--output <OUTPUT_JSONL_OR_PARQUET> \
--output-format jsonl \
--max-workers 200

Launch the memory retrieval service (see "Reproducing MemGovern" above). The server reads these env vars:
- `DB_DIR`
- `JSON_DATA_PATH`
- `MODEL_PATH`
- `HOST` (default `0.0.0.0`)
- `PORT` (default `9030`)
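If you prefer launching from a script, the same env contract can be set programmatically. A small sketch (paths are placeholders):

```python
import os
import subprocess

# Build the server's environment on top of the current one.
env = os.environ.copy()
env.update({
    "DB_DIR": "/path/to/chroma_db_experience",
    "JSON_DATA_PATH": "/path/to/experience_data.json",
    "MODEL_PATH": "<PATH_OR_MODEL_ID_FOR_SENTENCE_TRANSFORMERS>",
    "HOST": "0.0.0.0",  # default
    "PORT": "9030",     # default
})

# Launch the Experience Server with the configured environment.
subprocess.run(["python", "tools/experience_server.py"], env=env, check=True)
```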
| Config | Use Case |
|---|---|
| `config/benchmarks/*.yaml` | Full benchmark sweeps with different governance settings |
| `config/demo/*.yaml` | Quick demos with minimal latency |
| `config/human/*.yaml` | Human evaluation study protocols |
| `config/exotic/*.yaml` | Ablation: windowed replace, late reproduction |
We welcome contributions of all kinds: new configs, tools, bug fixes, or documentation improvements!
- 🐛 Bug Reports: Open an issue
- 💡 New Configs: Add timestamped YAML files under `config/`
- 🔧 New Tools: Extend the `tools/` directory with your utilities
- 📊 Trajectories: Share model runs via Git LFS

Note: Large files (>50 MB) should use Git LFS. Run `git lfs ls-files` before committing.
Special thanks to:
- SWE-Agent - The foundation agent framework
- RepoMaster - Autonomous repository exploration
- SWE-Bench - The evaluation benchmark
- ChromaDB - Vector database for memory retrieval
QuantaAlpha was founded in April 2025 by researchers from Tsinghua University, Peking University, CAS, CMU, HKUST, and more.
🚀 Our mission: Explore the "quantum" of intelligence and pioneer the "alpha" frontier of agent research.

✨ Research Directions:
- CodeAgent: End-to-end autonomous task execution
- DeepResearch: Deep reasoning & retrieval-augmented intelligence
- Agentic RL: Agent-based reasoning and reinforcement learning
- Self-evolution: Multi-agent coordination and learning
🌐 Team Homepage: QuantaAlpha
📧 Email: quantaalpha.ai@gmail.com
⭐ If MemGovern helps your research, please give us a star!

Made with ❤️ by the QuantaAlpha Team