GitHub - yc1999/LLM2

LLM2: Let Large Language Models Harness System 2 Reasoning

🎉 What's New

[2024.12.31] The LLM2 repository is released. It is still under active refinement and improvement.

🎈 Introduction

We introduce LLM2, a novel framework that combines an LLM (System 1) with a process-based verifier (System 2). Within LLM2, the LLM is responsible for generating plausible candidates, while the verifier provides timely process-based feedback to distinguish desirable and undesirable outputs. The verifier is trained with a pairwise comparison loss on synthetic process-supervision data generated through our token quality exploration strategy.
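To make the pairwise comparison objective more concrete, here is a minimal sketch in plain Python. It assumes the common logistic (Bradley-Terry style) form, where the verifier's score for a preferred token should exceed its score for a rejected one. The function name and the exact formula are illustrative assumptions, not taken from this repository; the loss actually used in training may differ.

```python
import math


def pairwise_comparison_loss(score_preferred: float, score_rejected: float) -> float:
    """Illustrative pairwise loss: -log(sigmoid(score_preferred - score_rejected)).

    Assumption: a logistic pairwise form. The repository's own loss may differ.
    """
    margin = score_preferred - score_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign of the
    # margin so that exp() never receives a large positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss shrinks as the preferred token's score pulls ahead of the rejected one.
well_ranked = pairwise_comparison_loss(2.0, 0.0)
badly_ranked = pairwise_comparison_loss(0.0, 2.0)
```

Under this form, a correctly ranked pair yields a small loss and an inverted pair yields a large one, which is what pushes the verifier to separate desirable from undesirable tokens.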

🚀 Quick Start

This section is a quick guide to using the LLM2 framework with Llama-3.2-1B-Instruct on the GSM8K dataset.

Step 1: Install Environment

pip install -r requirements.txt

Step 2: Training

bash scripts/train_verifier.sh

Step 3: Inference

bash scripts/evaluate_gsm8k.sh

📚 Dataset

Datasets for models Llama-3.2-1B-Instruct, Llama-3.2-3B-Instruct, and Meta-Llama-3.1-8B-Instruct are located in the ./dataset directory. Each dataset file, such as Llama-3.2-1B-Instruct.jsonl, follows this structure:

{
  "messages": [
    {"content": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?", "role": "user"},
    {"content": "Natalia sold 48 / 2 = 24 clips in May. Natalia sold 48 + 24 = 72 clips altogether in April and May. So the answer is 72.", "role": "assistant"}
  ],
  "samples": [
    [52,220,264],[52,220,264],[53,2166,17],[53,2166,1187],[54,611,489],[54,611,353],[55,220,19],[55,220,662],[56,17,18],[56,17,19],[57,284,353],[57,284,353],[58,220,27203],[58,220,27203],[59,1187,1419],
    [59,1187,17],[68,2166,17],[68,2166,19],[69,489,865],[69,489,353],[70,220,508],[70,220,914],[71,1187,2166],[71,1187,914],[74,5332,717],[74,5332,1187],[87,5332,8929],[87,5332,8929]
  ]
}

messages: OpenAI-style chat messages.
samples: List of sampled negative tokens, stored as triplets. The same token position can appear in more than one triplet. Each triplet contains:
- Token position (after applying the chat template and tokenizing messages).
- Ground truth token ID.
- Negative token ID.
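The record layout described above can be read with the standard library alone. The sketch below assumes each line of a dataset file holds one JSON record (the usual .jsonl convention); the helper names read_jsonl and group_negatives are hypothetical and do not come from this repository. It groups the sampled negative tokens by position, so each position maps to its ground truth token ID and the negatives sampled for it.

```python
import json
from collections import defaultdict


def read_jsonl(path):
    """Yield one record per non-empty line (assumes one JSON object per line)."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def group_negatives(record):
    """Map token position -> {"gold": ground truth ID, "negatives": [negative IDs]}."""
    grouped = defaultdict(lambda: {"gold": None, "negatives": []})
    for position, gold_id, negative_id in record["samples"]:
        grouped[position]["gold"] = gold_id
        grouped[position]["negatives"].append(negative_id)
    return dict(grouped)


# A shortened record using the first four triplets from the example above.
record = json.loads(
    '{"messages": [], "samples": [[52,220,264],[52,220,264],[53,2166,17],[53,2166,1187]]}'
)
grouped = group_negatives(record)
```

For a full file, the same grouping could be applied to each record returned by read_jsonl("./dataset/Llama-3.2-1B-Instruct.jsonl"). Duplicate negatives at a position are kept as they appear, since the source does not say whether they should be deduplicated.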

📑 Citation

If you find this repository useful, please consider giving it a star and citing our paper:

@article{yang2024llm,
  title = {LLM2: Let Large Language Models Harness System 2 Reasoning},
  author = {Yang, Cheng and Shi, Chufan and Li, Siheng and Shui, Bo and Yang, Yujiu and Lam, Wai},
  year = {2024},
  journal = {arXiv preprint arXiv:2412.20372}
}
