integration-vllm-test #2258
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2258
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit d152dac with merge base 1017c7e. This comment was automatically generated by Dr. CI and updates every 15 minutes.
stack-info: PR: #2258, branch: drisspg/stack/58
os.environ["VLLM_USE_V1"] = "1"
os.environ["VLLM_ENABLE_V1_MULTIPROCESSING"] = "0"
os.environ["VLLM_TEST_STANDALONE_COMPILE"] = "1"
Are you missing VLLM_DISABLE_COMPILE_CACHE?
Can you write down, for each flag, whether it is always required vs. temporary, etc.?
I made a note here: #2239
You can see that VLLM_TEST_STANDALONE_COMPILE, which will ultimately be the default, makes it so that we can compile subclasses.
This is a good point. I think we should actually just set this in the AO integration upstream.
You mean modify these flags when people use AO?
Yeah, in the AO integration, if someone is loading an AO model we should just set this flag. I will make a PR.
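For reference, a rough sketch of what "just set this flag" could look like. This is only an illustration: `apply_ao_env_defaults` and `AO_VLLM_ENV_DEFAULTS` are hypothetical names, not existing vLLM or torchao APIs, and the only flag included is the one from the diff in this thread. It fills in the value only when the user has not set it, so an explicit override still wins.

```python
import os

# Hypothetical table of defaults for AO-quantized models.
# The flag name and value come from the test diff in this PR;
# the table itself is an assumption, not part of vLLM.
AO_VLLM_ENV_DEFAULTS = {
    "VLLM_TEST_STANDALONE_COMPILE": "1",
}


def apply_ao_env_defaults(env=None, defaults=None):
    """Set each default only if the variable is not already present.

    Returns a dict of the variables that were actually set.
    """
    env = os.environ if env is None else env
    defaults = AO_VLLM_ENV_DEFAULTS if defaults is None else defaults
    applied = {}
    for name, value in defaults.items():
        if name not in env:
            env[name] = value
            applied[name] = value
    return applied


if __name__ == "__main__":
    # Would need to run before the engine reads its environment.
    print(apply_ao_env_defaults())
```

Leaving already-set variables alone seems like the safer choice here, since a user may want to opt out explicitly while the flag is still experimental.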
print(f"Quick test - Input: {test_input}, Output: {decoded}")

# Save quantized model
print(f"Saving quantized model to {output_dir}...")
should we delete these in the end?
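One possible way to handle the cleanup, sketched under the assumption that the saved checkpoint is only needed for the duration of the test: write it into a `tempfile.TemporaryDirectory`, which is removed automatically on exit. `run_with_temp_output` and `save_fn` are hypothetical stand-ins for whatever the test uses to save the quantized model.

```python
import tempfile
from pathlib import Path


def run_with_temp_output(save_fn):
    """Call save_fn(output_dir) with a directory that is deleted afterwards.

    Returns the names of the files that were written and the (now removed)
    directory path, so the caller can still log what happened.
    """
    with tempfile.TemporaryDirectory(prefix="ao_quantized_") as output_dir:
        save_fn(output_dir)
        saved = sorted(p.name for p in Path(output_dir).iterdir())
    # The directory and its contents are gone once the with-block exits.
    return saved, output_dir


if __name__ == "__main__":
    def fake_save(output_dir):
        # Stand-in for the real save call in the test.
        (Path(output_dir) / "model.bin").write_bytes(b"\x00")

    files, path = run_with_temp_output(fake_save)
    print(f"Saved {files} to {path}; exists afterwards: {Path(path).exists()}")
```

If the artifacts are useful for debugging failed runs, an explicit opt-out (for example a flag that skips deletion) may be preferable to unconditional cleanup.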
Stacked PRs:
integration-vllm-test