fix(channel): split intermediate and init readiness channels #3504

Merged: saku3 merged 1 commit into youki-dev:main from uran0sH:fix-initready-race on May 18, 2026

@uran0sH uran0sH commented Apr 17, 2026

Contributor

Description

Under high concurrency, the init process can send InitReady before the
intermediate process sends IntermediateReady. The main process used to wait
for both messages on the same receiver, so wait_for_intermediate_ready()
could receive InitReady first and fail with an unexpected message error.

Add separate main-facing receivers for intermediate-owned and init-owned
readiness messages. IntermediateReady is read from the intermediate receiver,
while InitReady is forwarded through the init receiver and remains available
until wait_for_init_ready() consumes it.
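
A minimal sketch of the idea, assuming in-process `std::sync::mpsc` channels in place of the real sockets. The `MainReceivers` struct, `split_channels`, and `run_with_init_first` are hypothetical names used for illustration only; they are not the types in this patch.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum Message {
    IntermediateReady(i32),
    InitReady,
}

// Hypothetical main-side view: one receiver per sending process.
struct MainReceivers {
    intermediate: Receiver<Message>,
    init: Receiver<Message>,
}

impl MainReceivers {
    // Only reads the intermediate-owned channel, so InitReady can never show up here.
    fn wait_for_intermediate_ready(&self) -> Result<i32, String> {
        match self.intermediate.recv().map_err(|e| e.to_string())? {
            Message::IntermediateReady(pid) => Ok(pid),
            other => Err(format!("unexpected message: {:?}", other)),
        }
    }

    // InitReady stays queued on the init-owned channel until this call consumes it.
    fn wait_for_init_ready(&self) -> Result<(), String> {
        match self.init.recv().map_err(|e| e.to_string())? {
            Message::InitReady => Ok(()),
            other => Err(format!("unexpected message: {:?}", other)),
        }
    }
}

fn split_channels() -> (Sender<Message>, Sender<Message>, MainReceivers) {
    let (intermediate_tx, intermediate_rx) = mpsc::channel();
    let (init_tx, init_rx) = mpsc::channel();
    let main = MainReceivers {
        intermediate: intermediate_rx,
        init: init_rx,
    };
    (intermediate_tx, init_tx, main)
}

// Forces the problematic ordering: InitReady is sent before IntermediateReady.
fn run_with_init_first() -> Result<i32, String> {
    let (intermediate_tx, init_tx, main) = split_channels();

    // "Init" sends first; joining guarantees its message is already queued.
    let init = thread::spawn(move || init_tx.send(Message::InitReady).unwrap());
    init.join().unwrap();

    // "Intermediate" sends later, after a short delay.
    let intermediate = thread::spawn(move || {
        thread::sleep(Duration::from_millis(10));
        intermediate_tx.send(Message::IntermediateReady(42)).unwrap();
    });

    let pid = main.wait_for_intermediate_ready()?;
    main.wait_for_init_ready()?;
    intermediate.join().unwrap();
    Ok(pid)
}
```

In this model both waits succeed even though InitReady was sent first, because each wait reads only from the channel owned by the process it is waiting for.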

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test updates
  • CI/CD related changes
  • Other (please describe):

Testing

  • Added new unit tests
  • Added new integration tests
  • Ran existing test suite
  • Tested manually (please provide steps): 1. Clone boxlite. 2. Apply this patch and change the youki version used by boxlite. 3. Run make test.

Related Issues

ContainerBuilder::build() fails intermittently with:
received unexpected message: InitReady, expected: IntermediateReady(0)
The error comes from libcontainer::process::channel::MainReceiver::wait_for_intermediate_ready().

Normal message flow
During container creation, libcontainer spawns two child processes:

  1. Intermediate process: sets up namespaces, cgroups, etc.
  2. Init process: the actual container payload.

They communicate with the parent via a SOCK_SEQPACKET socket:

  1. Intermediate sends IntermediateReady(pid).
  2. Parent receives it in wait_for_intermediate_ready().
  3. Init sends InitReady.
  4. Parent receives it in wait_for_init_ready().

Root cause: parallel scheduling race
The init process is forked by the intermediate process using clone3(CLONE_PARENT), so the two processes run in parallel. Under CPU pressure, the scheduler may run the init process ahead of the intermediate process, so InitReady can arrive at the parent before IntermediateReady.
wait_for_intermediate_ready() calls recv() only once. When it sees InitReady, it treats it as an unexpected message and aborts the build.
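
The failure mode can be modeled in a few lines. This is a minimal sketch, not libcontainer's actual code: it assumes a single in-process `std::sync::mpsc` channel standing in for the shared socket, and the `Message` enum, the free function `wait_for_intermediate_ready`, and the helper `simulate` are hypothetical simplifications.

```rust
use std::sync::mpsc;

#[derive(Debug)]
enum Message {
    IntermediateReady(i32),
    InitReady,
}

// Simplified model: read exactly one message and reject anything
// that is not IntermediateReady.
fn wait_for_intermediate_ready(rx: &mpsc::Receiver<Message>) -> Result<i32, String> {
    match rx.recv().map_err(|e| e.to_string())? {
        Message::IntermediateReady(pid) => Ok(pid),
        other => Err(format!(
            "received unexpected message: {:?}, expected: IntermediateReady(0)",
            other
        )),
    }
}

// Queues the messages on one shared channel in the given arrival order,
// then performs the single read the parent would do.
fn simulate(order: Vec<Message>) -> Result<i32, String> {
    let (tx, rx) = mpsc::channel();
    for msg in order {
        tx.send(msg).map_err(|e| e.to_string())?;
    }
    wait_for_intermediate_ready(&rx)
}
```

With the expected order, `simulate` returns the pid. With InitReady queued first, it returns the same kind of error as the one reported above, since the single read has no way to skip or defer a message that belongs to the other wait.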

Additional Context

saku3 commented Apr 18, 2026

Member

Thank you for the fix.

Would it be possible to share the actual reproduction steps?

I understand that the intermediate process and the init process run concurrently, but once the init process has been cloned, the intermediate process seems to have very little left to do before sending IntermediateReady, whereas the init process still appears to have a fair amount of work remaining, such as namespace setup and mount-related operations.

So I’d like to better understand under what kind of real-world scenario or environment this was observed.

As for the fix itself, I’d also like to see a unit test added for this scenario.

uran0sH commented Apr 19, 2026

Contributor Author

> Thank you for the fix.
>
> Would it be possible to share the actual reproduction steps?
>
> I understand that the intermediate process and the init process run concurrently, but once the init process has been cloned, the intermediate process seems to have very little left to do before sending IntermediateReady, whereas the init process still appears to have a fair amount of work remaining, such as namespace setup and mount-related operations.
>
> So I’d like to better understand under what kind of real-world scenario or environment this was observed.
>
> As for the fix itself, I’d also like to see a unit test added for this scenario.

I ran make test:integration in boxlite (https://github.com/boxlite-ai/boxlite). It fails:

FAIL [  21.353s] boxlite::zygote_integration test_zygote_concurrent_stderr_isolation
FAIL [  20.259s] boxlite::zygote_integration test_zygote_concurrent_workdir_isolation
FAIL [  19.162s] boxlite::zygote_integration test_zygote_repeated_bursts
FAIL [  20.012s] boxlite::zygote_integration test_zygote_sequential_then_concurrent

And the reason is:

thread 'test_concurrent_exec_high_concurrency' (1656451) panicked at src/boxlite/tests/execution_shutdown.rs:1466:46:
    called `Result::unwrap()` on an `Err` value: Internal("spawn_failed: internal error: build failed: failed to create container: received unexpected message: InitReady, expected: IntermediateReady(0)")

I suspect it's because it's running inside a VM, which increases the probability of this happening.

@uran0sH uran0sH force-pushed the fix-initready-race branch from 3d0d8d4 to dc061a1 on April 19, 2026 10:21
@nayuta723

Contributor

Is this an issue that can also be reproduced in runc?

@nayuta723

Contributor

My understanding is that this issue does not occur in runc because it uses separate pipes for communication between the main process and the intermediate process, and between the main process and the init process.

In contrast, in youki’s implementation, the main receiver handles messages from both the intermediate and init processes over the same channel, which I believe is what leads to this issue.

saku3 commented Apr 25, 2026

Member

Thank you for the comment.

I think this PR may be acceptable as an immediate fix, assuming the tests pass.

However, I think the same ordering issue could also happen with messages other than InitReady, such as HookRequest, SetupNetworkDeviceReady, or SeccompNotify.

As nayuta723 mentioned, youki currently receives messages from both the intermediate process and the init process through the shared MainReceiver. To fix this more permanently, I think we should split the communication channels, for example by using separate intermediate-to-main and init-to-main channels.

uran0sH commented Apr 25, 2026

Contributor Author

> Thank you for the comment.
>
> I think this PR may be acceptable as an immediate fix, assuming the tests pass.
>
> However, I think the same ordering issue could also happen with messages other than InitReady, such as HookRequest, SetupNetworkDeviceReady, or SeccompNotify.
>
> As nayuta723 mentioned, youki currently receives messages from both the intermediate process and the init process through the shared MainReceiver. To fix this more permanently, I think we should split the communication channels, for example by using separate intermediate-to-main and init-to-main channels.

I agree with you. I can also provide a permanent fix.

uran0sH commented May 15, 2026

Contributor Author

@saku3 @nayuta723 Hi, I've submitted a new version that splits the channels. I've verified it with Boxlite's tests, but I haven't added corresponding tests to the youki repository yet. I think it is difficult to design a test that reproduces this problem. Do you have any suggestions?

nayuta723 commented May 17, 2026

Contributor

Thank you! In my opinion, we don't need to add tests for asynchronous edge cases like CPU pressure, etc. I believe that for this PR, ensuring all existing tests pass and sharing the manual verification results on Boxlite under 'Tested manually (please provide steps)' should be sufficient.

@saku3 What do you think about that?

saku3 commented May 17, 2026

Member

I agree with nayuta723’s suggestion.

@saku3 saku3 left a comment

Member

Thanks, that makes sense. If you have a chance, could you update both the PR title and the commit message to reflect the current implementation?

@saku3 saku3 requested a review from nayuta723 May 17, 2026 10:33
Under high concurrency, the init process can send InitReady before the intermediate process sends IntermediateReady. The main process used to wait for both messages on the same receiver, so wait_for_intermediate_ready() could receive InitReady first and fail with an unexpected message error.

Add separate main-facing receivers for intermediate-owned and init-owned readiness messages. IntermediateReady is read from the intermediate receiver, while InitReady is forwarded through the init receiver and remains available until wait_for_init_ready() consumes it.

Signed-off-by: Wenyu Huang <huangwenyuu@outlook.com>
@uran0sH uran0sH force-pushed the fix-initready-race branch from 2cae7e7 to 519c459 on May 17, 2026 12:46
@uran0sH uran0sH changed the title from "fix(channel): cache InitReady when it arrives before IntermediateReady" to "fix(channel): split intermediate and init readiness channels" on May 17, 2026

@nayuta723 nayuta723 left a comment

Contributor

LGTM

Thanks!

@saku3 saku3 merged commit 4b2f0e0 into youki-dev:main May 18, 2026
28 checks passed
@github-actions github-actions Bot mentioned this pull request May 18, 2026
3 participants