Code for The
Direct LLM inference and standard RLM inference are constrained by context windows and can rely on hard-to-predict decomposition strategies. Lambda-RLM addresses this by:
- Planning decomposition ahead of execution with a deterministic recursive strategy
- Expressing inference through functional structure, with model calls at local steps and symbolic operators for composition
- Breaking long inputs into manageable chunks that fit within the model context window
- Applying the model only to bounded leaf subproblems and combining intermediate results through structured operators such as `SPLIT`, `MAP`, `FILTER`, `REDUCE`, `CONCAT`, and `CROSS`
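The composition pattern above can be sketched in plain Python. This is an illustrative stand-in, not the library's actual operator API: `call_model` fakes a bounded leaf-level model call, and the symbolic operators are ordinary functions.

```python
# Illustrative sketch of the SPLIT -> MAP -> REDUCE pipeline described above.
# These are stand-ins; the real Lambda-RLM operators and signatures may differ.

def split(text, chunk_size):
    """SPLIT: break a long input into chunks that fit a bounded context."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def call_model(chunk):
    """Stand-in leaf call: here we just 'summarize' by taking the first line."""
    return chunk.strip().splitlines()[0]

def reduce_concat(parts):
    """REDUCE/CONCAT: combine intermediate results symbolically, no model call."""
    return "\n".join(parts)

document = "line one\nline two\nline three\n" * 10
chunks = split(document, chunk_size=60)          # SPLIT: bounded subproblems
summaries = [call_model(c) for c in chunks]      # MAP: model only at the leaves
answer = reduce_concat(summaries)                # REDUCE: symbolic combination
```

The key property this illustrates is that the model is never handed the full input, only leaf chunks, while composition is handled deterministically by the operators.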
From the project root (the directory containing `pyproject.toml`):

```shell
conda create -n lambda-rlm python=3.11 -y
conda activate lambda-rlm
pip install -e .
```

The project supports multiple API-compatible model providers. For example, you can request an NVIDIA NIM API key or a Together AI API key to access the available model backends. Set your API key as an environment variable:

```shell
export NVIDIA_API_KEY="nvapi-..."
export TOGETHER_API_KEY="tgp_..."
```

Supported benchmark datasets:

- `sniah` — Sequential-NIAH examples loaded from the public GitHub JSONL source
- `oolong` — single-document QA examples loaded from `THUDM/LongBench-v2`
- `browsecomp` — multi-document QA examples loaded from `THUDM/LongBench-v2`
- `codeqa` — code repository understanding examples loaded from a local JSONL file or from `THUDM/LongBench-v2`
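The API keys above can be turned into backend settings with a small helper. This is a sketch, not part of the repository: the NVIDIA base URL is the one used in the quick-start example, and the Together URL is that provider's documented OpenAI-compatible endpoint (confirm both against the providers' docs).

```python
import os

def pick_backend():
    """Choose backend settings from whichever API key is set.

    Illustrative helper, not part of the library. Checks NVIDIA first,
    then Together, and fails loudly if neither key is present.
    """
    if os.environ.get("NVIDIA_API_KEY"):
        return {
            "api_key": os.environ["NVIDIA_API_KEY"],
            "base_url": "https://integrate.api.nvidia.com/v1",
        }
    if os.environ.get("TOGETHER_API_KEY"):
        return {
            "api_key": os.environ["TOGETHER_API_KEY"],
            "base_url": "https://api.together.xyz/v1",
        }
    raise RuntimeError("Set NVIDIA_API_KEY or TOGETHER_API_KEY")
```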
```python
import os
from rlm import LambdaRLM

document = """
This report discusses the development of a new battery technology.
It covers technical design choices, manufacturing trade-offs, safety concerns,
cost reduction strategies, and future commercialization plans.
One section focuses on performance improvements in energy density.
Another section discusses supply-chain risks and regulatory constraints.
The final section outlines expected market impact and open research questions.
"""

prompt = f"""Context:
{document}
Question: Summarize the main ideas discussed in this document.
Answer:"""

rlm = LambdaRLM(
    backend_kwargs={
        "model_name": "meta/llama-3.3-70b-instruct",
        "api_key": os.environ["NVIDIA_API_KEY"],
        "base_url": "https://integrate.api.nvidia.com/v1",
    }
)

result = rlm.completion(prompt)
print(result.response)
```

This repository uses upstream Normal RLM components for comparison: https://github.com/alexzhang13/rlm
The upstream code is licensed under the MIT License. See THIRD_PARTY_NOTICES.md for attribution and licensing details.
Key files:

- `rlm/core/rlm.py` — main REPL-based RLM loop
- `rlm/environments/local_repl.py` — sandboxed Python REPL execution, context storage, and helper functions
- `rlm/utils/parsing.py` — parsing of `repl` code blocks and FINAL markers; formatting of execution output back into the model history
- `rlm/clients/openai.py` — OpenAI-compatible client used with NVIDIA NIM
- `rlm/lambda_rlm.py` — LambdaRLM implementation, including task detection, planning, and deterministic execution through $\Phi$
The benchmark entry point runs the supported datasets under the same setup and compares behavior, latency, and output quality across Normal RLM (`rlm`) and Lambda-RLM (`lambda_rlm`).
```shell
python benchmarks/benchmark.py \
  --datasets sniah \
  --model meta/llama-3.3-70b-instruct \
  --methods rlm lambda_rlm \
  --n-samples-per-bucket 2 \
  --max-iter 8 \
  --max-depth 2 \
  --context-window 100000 \
  --output-dir ./results/llama-3.3-70b-instruct
```

Outputs are written to the specified output directory, typically including:

- `results.json`
- `stats.json`
- `averages.json`
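A small helper can gather whichever of these output files a run produced. This is an illustrative snippet, not part of the repository; it only matches the file names listed above and makes no assumption about their internal schemas.

```python
import json
from pathlib import Path

def load_benchmark_outputs(output_dir):
    """Load whichever of the three benchmark JSON outputs exist.

    Illustrative helper, not part of the repository. Returns a dict
    mapping file name -> parsed JSON for the files that are present.
    """
    outputs = {}
    for name in ("results.json", "stats.json", "averages.json"):
        path = Path(output_dir) / name
        if path.exists():
            outputs[name] = json.loads(path.read_text())
    return outputs
```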
To run each method separately, use distinct output directories:

```shell
python benchmarks/benchmark.py \
  --datasets sniah \
  --model meta/llama-3.3-70b-instruct \
  --methods rlm \
  --n-samples-per-bucket 2 \
  --max-iter 8 \
  --max-depth 2 \
  --context-window 100000 \
  --output-dir ./results/llama-3.3-70b-instruct_rlm

python benchmarks/benchmark.py \
  --datasets sniah \
  --model meta/llama-3.3-70b-instruct \
  --methods lambda_rlm \
  --n-samples-per-bucket 2 \
  --max-iter 8 \
  --max-depth 2 \
  --context-window 100000 \
  --output-dir ./results/llama-3.3-70b-instruct_lambda_rlm
```