
[BREAKING][rollout] refactor: move LLMServerManager out of AgentLoopManager#6129

Merged
wuxibin89 merged 2 commits into verl-project:main from wuxibin89:wuxibin/refactor_llm_server on Apr 29, 2026

Conversation

Collaborator

@wuxibin89 wuxibin89 commented Apr 23, 2026

What does this PR do?

AgentLoopManager is one specific agent-framework implementation in verl and is designed to be fully replaceable by other agent frameworks.

Previously the LLM server replicas (launch / tear-down / load balancer / profiling / KV-cache clearing) were owned by AgentLoopManager, which forced every alternative agent framework to either inherit from AgentLoopManager or re-implement the rollout server plumbing. This made integration of third-party agent frameworks inconvenient and entangled server life-cycle with agent scheduling.

This PR extracts LLM-server management into a standalone module verl/workers/rollout/llm_server.py, so that any agent framework can reuse the same rollout servers by consuming an LLMServerClient.
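The split can be illustrated with a toy sketch. The class names below are modeled on the PR description, but the bodies are hypothetical stand-ins, not the real verl implementation: the manager owns replica lifecycle and load balancing, while an agent framework only ever touches the client.

```python
class LLMServerClient:
    """What an agent framework sees: generate(), nothing about lifecycle."""

    def __init__(self, replicas):
        self._replicas = replicas
        self._next = 0

    def generate(self, prompt: str) -> str:
        # Trivial round-robin stand-in for the real load balancer.
        replica = self._replicas[self._next % len(self._replicas)]
        self._next += 1
        return f"{replica}:{prompt}"


class LLMServerManager:
    """Owns launch / tear-down; agent frameworks never touch this directly."""

    def __init__(self, num_replicas: int):
        self.replicas = [f"server-{i}" for i in range(num_replicas)]

    def get_client(self) -> LLMServerClient:
        return LLMServerClient(self.replicas)


manager = LLMServerManager(num_replicas=2)
client = manager.get_client()
print(client.generate("hello"))  # routed to server-0
print(client.generate("world"))  # routed to server-1
```

With this separation, a third-party agent framework depends only on the client interface and never inherits from AgentLoopManager.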


Compatibility

Breaking change for out-of-tree agent frameworks that imported AsyncLLMServerManager / FullyAsyncLLMServerManager from verl.experimental.agent_loop: import from verl.workers.rollout.llm_server and use the new names LLMServerClient / FullyLLMServerClient instead. The AgentLoopManager.create(...) signature also changed (see change #3).
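For out-of-tree code, the rename amounts to the following mapping (dotted paths copied from the paragraph above; this snippet only documents the rename and does not import verl):

```python
# Old dotted path (removed by this PR) -> new dotted path.
RENAMES = {
    "verl.experimental.agent_loop.AsyncLLMServerManager":
        "verl.workers.rollout.llm_server.LLMServerClient",
    "verl.experimental.agent_loop.FullyAsyncLLMServerManager":
        "verl.workers.rollout.llm_server.FullyLLMServerClient",
}

for old, new in RENAMES.items():
    print(f"{old} -> {new}")
```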

Test

  • Updated tests/checkpoint_engine/test_special_server_adapter.py and
    tests/experimental/agent_loop/* to the new APIs.
  • Docs (docs/advance/agent_loop.rst, docs/start/agentic_rl.rst) updated.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request refactors the LLM server management architecture by introducing LLMServerManager and LLMServerClient to replace the previous AsyncLLMServerManager implementation. The core logic for server lifecycle management and load balancing has been moved to verl/workers/rollout/llm_server.py, while AgentLoopManager and AgentLoopWorker have been updated to use the new client-based interface. Additionally, the FullyAsyncAgentLoopManager was refactored and moved to the fully async policy module, and corresponding updates were made across documentation, tests, and various trainer implementations to align with these changes. I have no feedback to provide.

@wuxibin89 wuxibin89 changed the title [rollout] refactor: move LLMServerManager out of AgentLoopManager [1/2][rollout] refactor: move LLMServerManager out of AgentLoopManager Apr 23, 2026
@wuxibin89 wuxibin89 changed the title [1/2][rollout] refactor: move LLMServerManager out of AgentLoopManager [BREAKING][rollout] refactor: move LLMServerManager out of AgentLoopManager Apr 24, 2026
@PeterSH6 PeterSH6 self-assigned this Apr 24, 2026
Collaborator

@ArronHZG ArronHZG left a comment


LLMServerManager

max_cache_size=DEFAULT_ROUTING_CACHE_SIZE,
)

def get_client(self, fully_async: bool = False) -> LLMServerClient:
Collaborator


I think this implementation should be fine, but it doesn't feel very elegant. Later, I might change it to pass in a client class and initialize it here.

Collaborator Author

@wuxibin89 wuxibin89 Apr 27, 2026


Sure. I noticed that #5900 adds an additional model_engine_server_handle to FullyAsyncLLMServerManager. We may need to pass in a subclass with additional kwargs in get_client.
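The pattern discussed here could look roughly like the following hypothetical sketch (toy classes, not the actual verl code): get_client accepts a client class plus extra kwargs, so a subclass that needs something like a model_engine_server_handle can be constructed by the manager without hard-coding each variant.

```python
class LLMServerClient:
    def __init__(self, replicas, **kwargs):
        self.replicas = replicas


class FullyLLMServerClient(LLMServerClient):
    # Hypothetical subclass needing one extra constructor argument.
    def __init__(self, replicas, model_engine_server_handle=None, **kwargs):
        super().__init__(replicas, **kwargs)
        self.model_engine_server_handle = model_engine_server_handle


class LLMServerManager:
    def __init__(self, replicas):
        self.replicas = replicas

    def get_client(self, client_cls=LLMServerClient, **kwargs):
        # The manager stays ignorant of subclass-specific kwargs; callers
        # choose the client class and supply whatever extras it needs.
        return client_cls(self.replicas, **kwargs)


manager = LLMServerManager(replicas=["server-0"])
client = manager.get_client(FullyLLMServerClient,
                            model_engine_server_handle="handle-0")
```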

PeterSH6 previously approved these changes Apr 28, 2026
Collaborator

@PeterSH6 PeterSH6 left a comment


LGTM. What's the plan for the old AsyncLLMServerManager?

@wuxibin89
Collaborator Author

LGTM. What's the plan for the old AsyncLLMServerManager?

The old AsyncLLMServerManager has been replaced by LLMServerClient.

@wuxibin89 wuxibin89 merged commit 3c5f6e0 into verl-project:main Apr 29, 2026
86 of 95 checks passed
Begunner added a commit to Begunner/verl that referenced this pull request Apr 30, 2026
Resolve conflicts in verl/experimental/agent_loop/agent_loop.py introduced by
PR verl-project#6129 (refactor: move LLMServerManager out of AgentLoopManager):

  * Imports - keep the function_tool import while accepting main's removal
    of prometheus_utils, teacher_loop, single_controller.ray.base imports.
  * AgentLoopWorker.__init__ - keep both the new "Online policy distillation"
    block (from main) and the "Load function-based tools once per worker"
    block (from this PR); ordering is irrelevant since they touch disjoint
    state.

The function_tools=FunctionToolListWrap(self.function_tools) kwarg in
_run_agent_loop auto-merged cleanly next to main's renamed
server_manager=self.llm_client.

Co-authored-by: Claude
Made-with: Cursor
xiefan46 added a commit to xiefan46/verl that referenced this pull request Apr 30, 2026
SamitHuang added a commit to SamitHuang/verl-omni that referenced this pull request Apr 30, 2026
…ckaging bug

The pinned verl commit (a512e90) ships a wheel that is missing
verl/experimental/reward_loop/router/ because the upstream directory had
no __init__.py at that commit and setuptools' default package discovery
silently drops it. This breaks the FlowGRPO trainer at runtime with
"ModuleNotFoundError: No module named 'verl.experimental.reward_loop.router'".

Switch the verl install in docs/start/install.md from a wheel install
(uv pip install git+…@<commit>) to a clone-and-editable install pinned
at the same commit. An editable install exposes the source tree on
sys.path, so router/ is picked up as a PEP 420 implicit namespace
package and the import works without any per-venv patching.

CI workflows are intentionally not touched because they don't exercise
the broken codepath. The pin will be bumped past
verl-project/verl#5209 once verl-omni is also adapted to the breaking
LLMServerClient refactor in verl-project/verl#6129 (tracked separately).
SamitHuang added a commit to SamitHuang/verl-omni that referenced this pull request May 1, 2026
Adapt verl-omni's diffusion agent loop and ray trainer to verl-project/verl#6129,
which removed AsyncLLMServerManager and made AgentLoopManager / AgentLoopWorker
consume an LLMServerClient produced by a separately-owned LLMServerManager.

verl-omni changes:
- DiffusionAgentLoopWorker.__init__ now takes (config, llm_client, teacher_client,
  reward_loop_worker_handles), matching the positional contract that
  AgentLoopManager.create() uses when spawning workers. _get_rollout_and_model_config
  was also dropped upstream, so the config slicing is inlined to keep the diff
  minimal.
- ray_diffusion_trainer now creates an LLMServerManager first, hands its client to
  AgentLoopManager.create(), and uses llm_server_manager.get_replicas() (instead of
  async_rollout_manager.rollout_replicas) to wire the CheckpointEngineManager. This
  mirrors the new pattern in upstream verl/trainer/ppo/ray_trainer.py.
- tests/agent_loop/test_diffusion_agent_loop.py is updated for the new API; in
  standalone test mode LLMServerManager spins up its own replicas via
  rollout.nnodes / n_gpus_per_node.

Pin / docs / CI:
- Bump the pinned verl commit to a4351480 (the merge commit of #5209), which is
  the first commit that ships verl/experimental/reward_loop/router/ in the wheel
  AND contains the #6129 refactor that this change adapts to. With this commit,
  the workaround in PR verl-project#51 (clone + editable install) is no longer required.
- Restore the simple `uv pip install git+...@<commit>` install line in
  docs/start/install.md.
- Bump the same pin in .github/workflows/{cpu_unit_tests,sanity,type-coverage-check}.yml.

This is a BREAKING change because DiffusionAgentLoopWorker.__init__ signature changed.
Any downstream code that subclasses or directly instantiates DiffusionAgentLoopWorker
must switch from (servers, load_balancer_handle, teacher_servers, teacher_load_balancer_handle)
to (llm_client, teacher_client). No public CLI/config surface is affected.

Signed-off-by: samithuang <285365963@qq.com>
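The new wiring order the commit describes can be sketched with toy classes (hypothetical bodies, not the real verl/verl-omni APIs): the trainer creates the server manager first, hands its client to the agent-loop manager, and reads replicas back off the server manager for checkpoint wiring.

```python
class LLMServerManager:
    def __init__(self, num_replicas):
        self._replicas = [f"replica-{i}" for i in range(num_replicas)]

    def get_client(self):
        # Stand-in for a real LLMServerClient.
        return {"replicas": self._replicas}

    def get_replicas(self):
        return list(self._replicas)


class AgentLoopManager:
    @classmethod
    def create(cls, llm_client):
        mgr = cls()
        mgr.llm_client = llm_client
        return mgr


class CheckpointEngineManager:
    def __init__(self, replicas):
        self.replicas = replicas


# New order: servers first, then the agent loop, then checkpoint wiring
# reads replicas from the server manager rather than from the agent loop.
llm_server_manager = LLMServerManager(num_replicas=2)
agent_loop_manager = AgentLoopManager.create(llm_server_manager.get_client())
ckpt = CheckpointEngineManager(llm_server_manager.get_replicas())
```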
SamitHuang added a commit to verl-project/verl-omni that referenced this pull request May 1, 2026
* [BREAKING][rollout] feat: adapt to verl LLMServerClient refactor
