[core] Don't preload jemalloc for worker. #39446
|
https://buildkite.com/ray-project/release-tests-pr/builds/52730#018a747a-c579-482c-8082-3c90eb54337a passes. cc @krfricke @can-anyscale: did it fail consistently in the release branch? How many times should we run it to verify that it is fixed? |
|
@rkooo567 Yes, both
I think as long as we see a successful run for each of them, it's good to go :) |
|
Running lightgbm_tune_4x16 now https://buildkite.com/ray-project/release-tests-pr/builds/52784 |
|
test_advanced_4.py failure seems related |
rkooo567
left a comment
@iycheng actually, I don't think I understand how this disables jemalloc only in workers. Can you give me some details?
Workers are started by the raylet, and the raylet has LD_PRELOAD set, so workers inherit it too. Unsetting it before a worker starts removes it from that worker. |
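The inheritance described in this reply can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Ray's actual implementation: `worker_env` and `child_preload_value` are hypothetical helper names, and a child Python process stands in for a worker started by the raylet.

```python
import os
import subprocess
import sys

PRELOAD_VAR = "LD_PRELOAD"


def worker_env(parent_env=None, preload_on_worker=False):
    """Build the environment for a child process (hypothetical helper).

    A child inherits its parent's environment by default, so LD_PRELOAD
    set on the parent would also apply to the child. Dropping the variable
    from the copy keeps the preloaded library out of the child only.
    """
    env = dict(os.environ if parent_env is None else parent_env)
    if not preload_on_worker:
        env.pop(PRELOAD_VAR, None)  # no error if the variable is absent
    return env


def child_preload_value(env):
    """Start a child Python process and return the LD_PRELOAD value it sees."""
    code = "import os; print(os.environ.get('LD_PRELOAD', '<unset>'))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Calling `child_preload_value(worker_env())` reports `<unset>` even when the parent process itself was started with `LD_PRELOAD`, because the variable is removed from the copy passed to the child while the parent's own environment is left untouched.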
Some libraries are not compatible with jemalloc. This PR disables jemalloc for Python workers.
Some libraries are not compatible with jemalloc. This PR disables jemalloc for Python workers. Signed-off-by: Jim Thompson <jimthompson5802@gmail.com>
This reverts commit 504757e.
Some libraries are not compatible with jemalloc. This PR disables jemalloc for Python workers. Signed-off-by: Victor <vctr.y.m@example.com>
|
It seems I cannot enable Jemalloc on workers by setting the env. In other words, it seems this PR disabled Jemalloc entirely for workers. |
We can set `RAY_LD_PRELOAD_ON_WORKERS` to `true`, together with `RAY_JEMALLOC_LIB_PATH` and `RAY_JEMALLOC_PROFILE`, to also preload jemalloc for workers. This is a follow-up fix to ray-project#39446. Signed-off-by: Yun Tang <myasuka@live.com>
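As a rough sketch of the configuration this commit message describes, the three variables could be collected into an environment before the cluster is started. Only the variable names come from the commit message. The helper name `jemalloc_worker_env`, the library path, and the profile string are placeholders, and this is an assumption about usage rather than a verified recipe.

```python
import os


def jemalloc_worker_env(lib_path, profile, base_env=None):
    """Return a copy of the environment with the jemalloc settings added.

    Hypothetical helper: the variable names are taken from the commit
    message, while `lib_path` and `profile` are supplied by the caller.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["RAY_JEMALLOC_LIB_PATH"] = lib_path
    env["RAY_JEMALLOC_PROFILE"] = profile
    # Opt in to preloading jemalloc for workers as well.
    env["RAY_LD_PRELOAD_ON_WORKERS"] = "true"
    return env


# Example usage (placeholder values, not executed here):
#   import subprocess
#   env = jemalloc_worker_env("/path/to/libjemalloc.so", "<profile-settings>")
#   subprocess.run(["ray", "start", "--head"], env=env, check=True)
```

Building a copy instead of mutating `os.environ` keeps the settings scoped to the process that launches the cluster.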
The PR #39446 disables preloading Jemalloc for workers entirely. However, Jemalloc is still useful in some cases, and we could make it configurable by having the user set the env variable RAY_LD_PRELOAD to 0.

The batch inference example code below uses a TF model to run inference on batches of NumPy ndarrays.

~~~python
import ray

# xxx, BatchPredictor, BatchPostProcessor and output_path are defined elsewhere.
ds = ray.data.read_tfrecords(xxx)
(
    ds.map_batches(BatchPredictor)
    .map_batches(BatchPostProcessor)
    .write_parquet(path=output_path)
)
~~~

I ran an inference test with limited memory, and the OOM count decreased from 900+ to 700.

Closes #47242

Signed-off-by: Yun Tang <myasuka@live.com>
Why are these changes needed?
Some libraries are not compatible with jemalloc. This PR disables jemalloc for Python workers.
Related issue number
Checks