[core] (cgroups 12/n) Raylet will start worker processes in the application cgroup #56549

Merged
edoakes merged 205 commits into master from irabbani/cgroups-12
Sep 24, 2025
Conversation

Contributor

@israbbani israbbani commented Sep 15, 2025

This PR stacks on #56522 .

For more details about the resource isolation project see #54703.

This PR makes the raylet move the runtime_env and dashboard agents into the system cgroup. Workers are now spawned inside the application cgroup.

It introduces the following:

  • A new target, `raylet_cgroup_types`, defines the type used by all functions that need to add a process to a cgroup.
  • A new parameter is added to the `NodeManager`, `WorkerPool`, `AgentManager`, and `Process` constructors. The parameter is a callback that uses the CgroupManager to add a process to the respective cgroup.
  • The callback is created in `main.cc`.
  • `main.cc` owns the CgroupManager because it needs to outlive the `WorkerPool`.
  • `process.c` calls the callback in the child process immediately after fork(), so nothing else can happen in the forked process before it is moved into the correct cgroup.
  • Python integration tests cover cgroups end to end, with system and application processes moved into their respective cgroups. The tests are in `python/ray/tests/resource_isolation/test_resource_isolation_integration.py` and have similar setup/teardown to the C++ integration tests introduced in [core] (cgroups 2/n) adding integration tests for the cgroup sysfs driver. #55063.
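The callback flow described above can be sketched roughly as follows. This is a Python stand-in for illustration only, not the raylet's C++ code: the names `AddProcessToCgroupHook`, `make_add_to_cgroup_hook`, `spawn_in_cgroup`, and `read_pids` are hypothetical, and a regular file stands in for a cgroup's `cgroup.procs`. It shows a hook that is created once, handed to the spawning code, and invoked in the child right after `fork()`.

```python
import os
import tempfile
from typing import Callable, Set

# Hypothetical callback type: takes a pid and adds that process to a cgroup.
AddProcessToCgroupHook = Callable[[int], None]


def make_add_to_cgroup_hook(cgroup_procs_path: str) -> AddProcessToCgroupHook:
    """Create the hook once (the PR does this in main.cc) and pass it down."""

    def hook(pid: int) -> None:
        # On a real cgroup v2 hierarchy, writing a pid to cgroup.procs
        # migrates that process. Here the path is just a regular file.
        with open(cgroup_procs_path, "a") as f:
            f.write(f"{pid}\n")

    return hook


def spawn_in_cgroup(add_to_cgroup: AddProcessToCgroupHook,
                    work: Callable[[], None]) -> int:
    """Fork, run the hook first in the child, then run the child's work."""
    pid = os.fork()
    if pid == 0:
        # Child: join the cgroup before doing anything else.
        exit_code = 1
        try:
            add_to_cgroup(os.getpid())
            work()
            exit_code = 0
        finally:
            # Exit without running the parent's cleanup handlers.
            os._exit(exit_code)
    return pid  # Parent: return the child's pid.


def read_pids(cgroup_procs_path: str) -> Set[int]:
    """Read the pids recorded in a cgroup.procs-style file."""
    with open(cgroup_procs_path) as f:
        return {int(line) for line in f if line.strip()}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        procs_file = os.path.join(tmp, "cgroup.procs")
        child = spawn_in_cgroup(make_add_to_cgroup_hook(procs_file), lambda: None)
        os.waitpid(child, 0)
        print(child in read_pids(procs_file))
```

An integration test in the same spirit could read a cgroup's `cgroup.procs` and check that the expected pids are present, which is roughly what `read_pids` does against the stand-in file.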

irabbani and others added 30 commits July 24, 2025 20:39
israbbani and others added 3 commits September 24, 2025 17:23
@edoakes edoakes merged commit 6044577 into master Sep 24, 2025
5 checks passed
@edoakes edoakes deleted the irabbani/cgroups-12 branch September 24, 2025 19:54

@ZacAttack ZacAttack left a comment

Ed beat me to it, but lgtm!

marcostephan pushed a commit to marcostephan/ray that referenced this pull request Sep 24, 2025
…n startup (ray-project#56522)

This PR stacks on ray-project#56352 .

For more details about the resource isolation project see
ray-project#54703.

This PR makes the raylet move the system processes into the system
cgroup on startup if resource isolation is enabled.

It introduces the following:
* A new raylet CLI arg `--system-pids`, which is a comma-separated string
of pids of system processes that are started before the raylet. As of
today, it contains:
  * On the head node: gcs_server, dashboard_api_server, ray client server,
monitor (autoscaler).
  * On every node (including head): process subreaper, log monitor.
* End-to-end integration tests for resource isolation with the Ray SDK
(`ray.init`) and the Ray CLI (`ray start`).
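As a rough illustration of the `--system-pids` format, here is a minimal sketch of building and parsing such a comma-separated string. The helper names `format_system_pids` and `parse_system_pids` are hypothetical, and the validation rules (skipping empty tokens, rejecting non-positive pids) are assumptions rather than the raylet's actual behavior.

```python
from typing import Iterable, List


def format_system_pids(pids: Iterable[int]) -> str:
    """Join pids into the comma-separated form passed on the command line."""
    return ",".join(str(pid) for pid in pids)


def parse_system_pids(arg: str) -> List[int]:
    """Parse a comma-separated pid string such as "101,102,103"."""
    pids: List[int] = []
    for token in arg.split(","):
        token = token.strip()
        if not token:
            # Tolerate an empty argument and stray commas (an assumption).
            continue
        pid = int(token)  # Raises ValueError on non-numeric input.
        if pid <= 0:
            raise ValueError(f"invalid pid: {pid}")
        pids.append(pid)
    return pids
```

For example, `parse_system_pids(format_system_pids([101, 102]))` gives back `[101, 102]`, and an empty string parses to an empty list.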

There are a few rough edges (I've added a comment on the PR where
relevant):
1. The construction of ResourceIsolationConfig is spread across multiple
call-sites (create the object, add the object store memory, add the
system pids). The big positive of doing it this way was to fail fast on
invalid user input (in scripts.py and worker.py). I think it needs to
have at least two components: the user input (cgroup_path,
system_reserved_memory, ...) and the derived input (system_pids,
total_system_reserved_memory).
2. How to determine which processes should be moved? Right now I'm using
`self.all_processes` in `node.py`. It _should_ contain all processes
started so far, but there's no guarantee.
3. How intrusive should the integration test be? Should we count the
number of pids inside the system cgroup? (This was answered in ray-project#56549)
4. How should a user set up multiple nodes on the same VM? I haven't
written an integration test for it yet because there are multiple
options for how to set this up.
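One possible shape for the split suggested in item 1 is sketched below. It is not the actual `ResourceIsolationConfig`: the class names `UserIsolationInput` and `DerivedIsolationInput` and the function `derive` are hypothetical, and the sketch assumes, purely for illustration, that the total reserved memory is the user's `system_reserved_memory` plus the object store memory. The user input validates itself on construction so that invalid values fail fast.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class UserIsolationInput:
    """Values supplied directly by the user; validated on construction."""
    cgroup_path: str
    system_reserved_memory: int  # bytes

    def __post_init__(self) -> None:
        if not self.cgroup_path:
            raise ValueError("cgroup_path must be non-empty")
        if self.system_reserved_memory < 0:
            raise ValueError("system_reserved_memory must be non-negative")


@dataclass(frozen=True)
class DerivedIsolationInput:
    """Values computed later, once processes and memory sizes are known."""
    system_pids: Tuple[int, ...]
    total_system_reserved_memory: int  # bytes


def derive(user: UserIsolationInput,
           object_store_memory: int,
           system_pids: Iterable[int]) -> DerivedIsolationInput:
    """Combine user input with runtime facts (assumed formula, see lead-in)."""
    return DerivedIsolationInput(
        system_pids=tuple(system_pids),
        total_system_reserved_memory=(
            user.system_reserved_memory + object_store_memory
        ),
    )
```

Keeping the two parts separate means the user-facing object can be built and validated in one place, while the derived part is produced in a single later step instead of being filled in across several call sites.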

---------

Signed-off-by: irabbani <israbbani@gmail.com>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Signed-off-by: Marco Stephan <marco@magic.dev>
marcostephan pushed a commit to marcostephan/ray that referenced this pull request Sep 24, 2025
…cation cgroup (ray-project#56549)

edoakes pushed a commit that referenced this pull request Sep 25, 2025
This PR stacks on #56549.

For more details about the resource isolation project see
#54703.

This PR deletes the old cgroup code inside `/src/ray/common/cgroup`
along with all associated comments.

---------

Signed-off-by: irabbani <israbbani@gmail.com>
edoakes added a commit to edoakes/ray that referenced this pull request Sep 26, 2025
elliot-barn pushed a commit that referenced this pull request Sep 27, 2025
…n startup (#56522)

elliot-barn pushed a commit that referenced this pull request Sep 27, 2025
…cation cgroup (#56549)

elliot-barn pushed a commit that referenced this pull request Sep 27, 2025
dstrodtman pushed a commit to dstrodtman/ray that referenced this pull request Oct 6, 2025
…n startup (ray-project#56522)

dstrodtman pushed a commit that referenced this pull request Oct 6, 2025
…cation cgroup (#56549)

dstrodtman pushed a commit that referenced this pull request Oct 6, 2025
justinyeh1995 pushed a commit to justinyeh1995/ray that referenced this pull request Oct 20, 2025
…n startup (ray-project#56522)

justinyeh1995 pushed a commit to justinyeh1995/ray that referenced this pull request Oct 20, 2025
…cation cgroup (ray-project#56549)

justinyeh1995 pushed a commit to justinyeh1995/ray that referenced this pull request Oct 20, 2025
…t#56909)

landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
…n startup (ray-project#56522)

landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
…cation cgroup (ray-project#56549)

landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
…t#56909)

Future-Outlier pushed a commit to Future-Outlier/ray that referenced this pull request Dec 7, 2025
…n startup (ray-project#56522)

Future-Outlier pushed a commit to Future-Outlier/ray that referenced this pull request Dec 7, 2025
…cation cgroup (ray-project#56549)

Future-Outlier pushed a commit to Future-Outlier/ray that referenced this pull request Dec 7, 2025
…t#56909)

Labels

core (Issues that should be addressed in Ray Core), go (add ONLY when ready to merge, run all tests)

3 participants