Skip to content

[ray.data.llm] Fix build_llm_processor for ServeDeploymentProcessor#57061

Merged
kouroshHakha merged 6 commits intoray-project:masterfrom
jeffreyjeffreywang:build-llm-processor-fix
Oct 7, 2025
Merged

[ray.data.llm] Fix build_llm_processor for ServeDeploymentProcessor#57061
kouroshHakha merged 6 commits intoray-project:masterfrom
jeffreyjeffreywang:build-llm-processor-fix

Conversation

@jeffreyjeffreywang
Copy link
Copy Markdown
Contributor

@jeffreyjeffreywang jeffreyjeffreywang commented Sep 30, 2025

Why are these changes needed?

chat_template_kwargs was introduced in #56490 and is currently passed through all processors, even though some (e.g., ServeDeploymentProcessor and HttpRequestProcessor) do not support this field. As a result, constructing a ServeDeploymentProcessor via build_llm_processor raises errors.

This issue wasn’t caught in tests because they have been using ProcessorBuilder.build, which bypasses the layer where chat_template_kwargs is passed.

In this PR, we:

  • Introduce builder_kwargs and delegate their validation to individual processors. The chat_template_kwargs will be passed as part of builder_kwargs to the vLLMEngineProcessor.
  • Update tests to rely on public APIs rather than developer APIs, ensuring they reflect end-user usage.

Related issue number

Resolves #57060

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run pre-commit jobs to lint the changes in this PR. (pre-commit setup)
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Note

Make build_llm_processor skip chat_template_kwargs for non-offline processors and update processor tests to use public ray.data.llm APIs.

  • Core (ray.data.llm)
    • Adjust build_llm_processor to only pass chat_template_kwargs when config is an OfflineProcessorConfig.
  • Tests
    • Switch to public APIs: import processor configs from ray.data.llm and use build_llm_processor instead of ProcessorBuilder.build where applicable.
    • Update Serve Deployment and vLLM processor tests accordingly; keep behavior/assertions intact.

Written by Cursor Bugbot for commit fb36cc3. This will update automatically on new commits. Configure here.

@jeffreyjeffreywang jeffreyjeffreywang requested a review from a team as a code owner September 30, 2025 23:25
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly addresses a bug where chat_template_kwargs was passed to processor types that do not support it, leading to errors. The fix, which involves checking if the configuration is an instance of OfflineProcessorConfig before passing the argument, is appropriate. Additionally, updating the tests to use the public build_llm_processor API instead of the internal ProcessorBuilder is a great improvement for ensuring the tests reflect end-user scenarios. I've provided one suggestion to refactor the implementation for better code clarity and maintainability.

@jeffreyjeffreywang
Copy link
Copy Markdown
Contributor Author

cc: @nrghosh

Copy link
Copy Markdown
Contributor

@nrghosh nrghosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @ahao-anyscale

Copy link
Copy Markdown
Contributor

@nrghosh nrghosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ltgm pending tests

@ray-gardener ray-gardener bot added data Ray Data-related issues community-contribution Contributed by the community labels Oct 1, 2025
Copy link
Copy Markdown
Contributor

@kouroshHakha kouroshHakha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a fundamental problem found in this issue / PR that needs to be addressed properly. The gist of the issue is that we did not enforce any standard signature for builder methods. it implicitly started with config, preprocess, postprocess and for some builders we added chat_template_kwargs` that went missing by other builders that were not on the critical path. The proper fix is either: 1) enforce a strong typing / signature for the builder methods or 2) make the builder kwargs a passthrough towards the registered builder functions.

So here is my proposal for fixing this fundamentally and is more aligned with approach 2.

  1. change the signature of the public build_llm_processor to (config, preprocess, postprocess, builder_kwargs)

  2. Keep the old things as is, and pass through the builder_kwargs to the registered builder function. If the builder function does not have the matching signature it will barf during the build.

The current fix in this PR is not good, because it creates a leakage that OfflineProcessorConfigs will always have chat_template_kwargs which is again an implicit assumption.

I think the second approach is more generic and flexible.

Copy link
Copy Markdown
Contributor

@nrghosh nrghosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

taking a first pass to address comments

Copy link
Copy Markdown
Contributor

@hao-aaron hao-aaron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with kourosh, i think it makes more sense for arg checking to be done inside each builder function.

cursor[bot]

This comment was marked as outdated.

@nrghosh nrghosh force-pushed the build-llm-processor-fix branch from e7fb1b4 to 66f1517 Compare October 3, 2025 20:53
@jeffreyjeffreywang
Copy link
Copy Markdown
Contributor Author

microcheck failure doesn't seem to be relevant:

[2025-10-03T21:12:09Z] --------
  | [2025-10-03T21:12:09Z]  > [3/3] RUN <<EOF (#!/bin/bash -i...):
  | [2025-10-03T21:12:09Z] 62.85 + /ray/ci/env/../../ci/suppress_output conda install -c anaconda mpi4py -y
  | [2025-10-03T21:12:09Z] 63.08
  | [2025-10-03T21:12:09Z] 63.08 real	0m0.229s
  | [2025-10-03T21:12:09Z] 63.08 user	0m0.198s
  | [2025-10-03T21:12:09Z] 63.08 sys	0m0.030s
  | [2025-10-03T21:12:09Z] 63.08 Error while loading conda entry point: conda-libmamba-solver (cannot import name 'Spinner' from 'conda.common.io' (/opt/miniforge/lib/python3.9/site-packages/conda/common/io.py))
  | [2025-10-03T21:12:09Z] 63.08
  | [2025-10-03T21:12:09Z] 63.08 CondaValueError: You have chosen a non-default solver backend (libmamba) but it was not recognized. Choose one of: classic
  | [2025-10-03T21:12:09Z] 63.08
  | [2025-10-03T21:12:09Z] 63.08 FAILED 1
  | [2025-10-03T21:12:09Z]

@nrghosh nrghosh added go add ONLY when ready to merge, run all tests community-contribution Contributed by the community and removed community-contribution Contributed by the community labels Oct 6, 2025
Comment on lines +481 to +487
# Pass through any additional builder kwargs
if builder_kwargs is not None:
# Check for conflicts with explicitly passed arguments
reserved_keys = {"preprocess", "postprocess"}
conflicting_keys = reserved_keys & builder_kwargs.keys()
if conflicting_keys:
raise ValueError(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you create a utility on ProcessBuilder to do this validation there?

ProcessBuilder.validate_builder_kwargs(builder_kwargs)

return ProcessBuilder.build(config, preprocess, postprocess, builder_kwargs)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, done in the latest revision.

@@ -188,6 +188,110 @@ def overrider(name: str, stage: StatefulStage):
)


def test_builder_kwargs_passthrough():
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you spearate out these test scenarios into different test methods? Maybe create a Test class to capture them all and share resource between them using fixtures, etc.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, done in the latest revision.

jeffreywang-anyscale and others added 4 commits October 6, 2025 21:44
…s that don't support it

Signed-off-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
Signed-off-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
- use generic `builder_kwargs` passthrough in `build_llm_processor`
- replace `chat_template_kwargs` param with `builder_kwargs` dict that
passes through to the registered builder functions
- remove implicit assumptions and have builders raise `TypeError` for
unsupported kwargs
- test passthrough with conflicts

Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
cursor[bot]

This comment was marked as outdated.

@jeffreyjeffreywang
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively resolves the issue of chat_template_kwargs being passed to unsupported processors by introducing a more generic builder_kwargs dictionary. This is a solid architectural improvement that enhances flexibility and robustness. The validation for reserved keys is a good safeguard, and updating the tests to use the public build_llm_processor API makes them more representative of end-user scenarios. The changes are well-implemented, and the new tests are thorough. I have one minor suggestion to simplify the code.

Signed-off-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
Signed-off-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
preprocess=lambda row: {"val": row["id"]},
builder_kwargs={conflicting_key: lambda row: {"other": row["id"]}},
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Test Registry Persistence Causes Flaky Errors

The ProcessorBuilder registry isn't consistently cleared between tests. test_builder() registers DummyProcessorConfig without unregistering it, leading to flaky "already registered" errors in TestBuilderKwargsValidation tests when test_builder() runs first.

Fix in Cursor Fix in Web

@kouroshHakha kouroshHakha enabled auto-merge (squash) October 7, 2025 01:59
@kouroshHakha kouroshHakha merged commit 62e1d85 into ray-project:master Oct 7, 2025
7 checks passed
liulehui pushed a commit to liulehui/ray that referenced this pull request Oct 9, 2025
…project#57061)

Signed-off-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Co-authored-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
Co-authored-by: Nikhil Ghosh <nikhil@anyscale.com>
joshkodi pushed a commit to joshkodi/ray that referenced this pull request Oct 13, 2025
…project#57061)

Signed-off-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Co-authored-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
Co-authored-by: Nikhil Ghosh <nikhil@anyscale.com>
Signed-off-by: Josh Kodi <joshkodi@gmail.com>
justinyeh1995 pushed a commit to justinyeh1995/ray that referenced this pull request Oct 20, 2025
…project#57061)

Signed-off-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Co-authored-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
Co-authored-by: Nikhil Ghosh <nikhil@anyscale.com>
xinyuangui2 pushed a commit to xinyuangui2/ray that referenced this pull request Oct 22, 2025
…project#57061)

Signed-off-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Co-authored-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
Co-authored-by: Nikhil Ghosh <nikhil@anyscale.com>
Signed-off-by: xgui <xgui@anyscale.com>
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
…project#57061)

Signed-off-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Co-authored-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
Co-authored-by: Nikhil Ghosh <nikhil@anyscale.com>
Aydin-ab pushed a commit to Aydin-ab/ray-aydin that referenced this pull request Nov 19, 2025
…project#57061)

Signed-off-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Co-authored-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
Co-authored-by: Nikhil Ghosh <nikhil@anyscale.com>
Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Future-Outlier pushed a commit to Future-Outlier/ray that referenced this pull request Dec 7, 2025
…project#57061)

Signed-off-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Co-authored-by: jeffreyjeffreywang <jeffjeffreywang@gmail.com>
Co-authored-by: Nikhil Ghosh <nikhil@anyscale.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-contribution Contributed by the community data Ray Data-related issues go add ONLY when ready to merge, run all tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ray.data.llm] Error when constructing ServeDeploymentProcessor due to unexpected argument from build_llm_processor

6 participants