Skip to content

feat: add UV_LOOP_INTERRUPT_ON_IO_CHANGE option to uv_loop_configure#3308

Closed
zcbenz wants to merge 2 commits into
libuv:v1.xfrom
zcbenz:uv-loop-option-interrupt
Closed

feat: add UV_LOOP_INTERRUPT_ON_IO_CHANGE option to uv_loop_configure#3308
zcbenz wants to merge 2 commits into
libuv:v1.xfrom
zcbenz:uv-loop-option-interrupt

Conversation

@zcbenz

@zcbenz zcbenz commented Sep 17, 2021

Copy link
Copy Markdown
Contributor

Close #3101.

Refs #3299, #3234, #1921, #1565, #1568.

When integrating an uv event loop with other event loops, we usually poll the backend fd in a separate thread and do uv_run whenever there is an event triggered. However in libuv some types of changes to handles do not trigger events on the backend fd (for example starting new timers), and the polling of backend fd will still use old timeout and old event flags, and this would cause problem like, Node's setTimeout not working in Electron.

This PR adds a UV_LOOP_INTERRUPT_ON_IO_CHANGE option to uv_loop_configure that, with it the loop will be interrupted whenever a new IO event has been added or changed, and the embedders will get a chance to restart the polling of the backend.

Electron has been using this technique for years and it has been working well.

/cc @bnoordhuis @vtjnash

@zcbenz zcbenz force-pushed the uv-loop-option-interrupt branch 2 times, most recently from 5d12f3c to 0e1889c Compare September 17, 2021 02:05
@stale

stale Bot commented Oct 30, 2021

Copy link
Copy Markdown

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.

@stale stale Bot added the stale label Oct 30, 2021
@zcbenz

zcbenz commented Oct 31, 2021

Copy link
Copy Markdown
Contributor Author

Bumping to remove the stale label.

@codebytere

Copy link
Copy Markdown
Contributor

Perhaps @cjihrig or @santigimeno might have thoughts as well!

@codebytere

Copy link
Copy Markdown
Contributor

@bnoordhuis @vtjnash may also have thoughts?

@bnoordhuis bnoordhuis left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if it works for Electron. I posted some suggestions I'll apply.

Comment thread test/test-embed.c Outdated
Comment thread test/test-embed.c Outdated
Comment thread test/test-embed.c Outdated
Comment thread test/test-embed.c Outdated
Comment thread test/test-embed.c Outdated
Comment thread test/test-embed.c Outdated
Comment thread test/test-embed.c Outdated
Comment thread test/test-embed.c Outdated
Comment thread test/test-embed.c Outdated
@bnoordhuis

Copy link
Copy Markdown
Member

Failing on Windows here:

assert(!(handle->flags & UV_HANDLE_CLOSING));

@zcbenz zcbenz force-pushed the uv-loop-option-interrupt branch 2 times, most recently from 0c9184d to 805ef15 Compare November 18, 2021 11:47
@zcbenz

zcbenz commented Nov 18, 2021

Copy link
Copy Markdown
Contributor Author

Thanks for the review, it turns out to be caused by difference between 1.x and 2.x: in 1.x the uv_backend_fd returns -1 on Windows, while in 2.x it returns the iocp handle correctly. This PR was originally based on 2.x when I started working on it.

@bnoordhuis

Copy link
Copy Markdown
Member

The failing test was probably a flake but I'll rerun CI just in case. For posterity:

 not ok 54 - fs_event_watch_dir_short_path
# exit code 3
# Output from process `fs_event_watch_dir_short_path`:
# Assertion failed in D:\a\libuv\libuv\test\test-fs-event.c on line 139: strcmp(filename, "file1") == 0

@XeCycle

XeCycle commented Dec 7, 2021

Copy link
Copy Markdown
Contributor

How does this patch behave in the special use case described in #3234 (comment)? Haven't tested, but looking at the code seems like this patch will still run like a busy poll.

@XeCycle

XeCycle commented Dec 7, 2021

Copy link
Copy Markdown
Contributor

Replying #3299 (comment) here:

You described an embedding setup where you poll uv_backend_fd(inner_loop) in another thread; let me call this thread B. And you have a thread A that calls uv_run. Due to libuv's thread safety design, all manipulations to inner_loop must happen on thread A, including adding/removing timers; I can think of 3 cases:

  • you can change thread A's loop code, and it works like a typical async-IO loop that frequently goes to sleep: then you can re-check uv_backend_timeout(inner_loop) just before going to sleep, and interrupt thread B here.
  • you can change thread A's loop code, but it is itself a busy loop that rarely goes to sleep: I suppose such code would need to re-check some external conditions to stop itself, just check timeout here.
  • you cannot change thread A's loop code: well, this patch would work as a last resort. Otherwise I'm not convinced this is a better approach than core: uv_backend_timeout should return 0 when has pending watchers #3234.

A ping-pong to thread B is of course a lot more heavyweight than polling directly in thread A; if this is possible I think #3234 uses less syscalls, if that busy polling behavior is not a problem.

@zcbenz

zcbenz commented Dec 8, 2021

Copy link
Copy Markdown
Contributor Author

@XeCycle I don't 100% understand your use case so I can't comment much on it. This PR means to fix the problem in #3101, and in this PR I have added a test that should be able to describe the problem in actual code.

If you could write a libuv test demonstrating how you are implementing loop integration and what problem you are trying to solve, the discussion would be easier. (And I think it would be required anyway if you want to get a significant change merged)

Back to your question, first I think it is important to understand that libuv is an event loop library that means to use as less CPU as possible, even though it involves using heavy operations is not a problem (for example, some async fs APIs are implemented with thread pools). So introducing busy loop into libuv is not desirable, and the loop integration used in test/test-embed.c is accepted. I'm sure some use cases are fine with busy loops, but lots are not, for example you surely don't want to have all Electron apps like Slack or VS Code taking 100% CPU on your computer.

If you think there is a better way to fix #3101 without this API, it would be great if you can write a proof of concept similar to test/test-embed.c. And if you think this PR would introduce busy loop, I would be very thankful if you can show it in code.

@XeCycle

XeCycle commented Dec 8, 2021

Copy link
Copy Markdown
Contributor

@zcbenz please checkout examples at XeCycle@d9023a7. I based this commit on my #3234. They are Linux-only, and the "outer loop" which embeds the inner libuv loop is a bare-bones minimal event loop using directly poll(2).

  • embed-example.c is my example of a typical embedding usage; it correctly prints and exits after 1 second, however if you rebase the commit to b6d51dc (which is the parent of my core: uv_backend_timeout should return 0 when has pending watchers #3234), the program sleeps forever without any output.
  • embed-busy-app.c is my example of an application that runs fine (that is, does not busy-loop; it does nothing after all) when driven by UV_RUN_DEFAULT (as is done in embed-busy-noembed.c), but becomes a busy loop when embedded (in embed-busy-embed.c).

I'm not sure I understood 100% your patch, would you like to take the trouble to solve the busy-looping of embed-busy-app.c with your patch?

@zcbenz

zcbenz commented Dec 9, 2021

Copy link
Copy Markdown
Contributor Author

Thanks for the example code, it helps very much understanding the problem, and I think it is a better way to integrate event loops on Linux.

However in our use case, we are integrating libuv loop with native GUI message loops, and unlike Linux that almost everything is a fd, on Windows the GUI message loop can not run together with IO message loop, and polling the message loops in separate threads is the only option we have.

(On Windows GUI message loop can not be used to listen to events from IOCP, and IOCP APIs can not be used to poll GUI messages.)

Also in our project the message loops are created and run by upstream projects (Node.js and Chromium), and we would like to avoid patching the projects when possible, and running message loops in separate threads can perfectly achieve this purpose.

So this is why we are doing loop integration via a separate thread.


Regarding your use case in embed-example.c, if you rebase the commit on this PR, and call uv_loop_configure(&inner_loop, UV_LOOP_INTERRUPT_ON_IO_CHANGE); after inner_app_initial_startup(&inner_loop);, it will work as expected.

And unfortunately #3234 does not fix our problem in test/test-embed.c: changing the return value of uv_backend_timeout does not interrupt existing polling, so the polling in separate thread will still be waiting with the old timeout.


Also as your example in embed-busy-embed shows, if there is code starting poll in prepare and stopping in check, both this API and #3234 will run into busy loop. I think your concern is valid and I don't have a solution to avoid that.

But this API has to be explicitly called by users to kick in, so if a user has read the documentation of this API and explicitly called this API, I think it is also user's responsibility to avoid doing the above operation. There are lots of ways to get into busy loop with official libuv APIs, as long as the document is clear I think it should be fine. For example it is simple to get into busy loop with uv_timer_start, but it is user's responsibility to avoid that.

@XeCycle

XeCycle commented Dec 9, 2021

Copy link
Copy Markdown
Contributor

on Windows the GUI message loop can not run together with IO message loop, and polling the message loops in separate threads is the only option we have.

Thanks for your explanation, I'm less experienced with Windows. Given this solid reason to do polling in separate threads, I welcome wholeheartedly inclusion of your patch.

And unfortunately #3234 does not fix our problem in test/test-embed.c: changing the return value of uv_backend_timeout does not interrupt existing polling, so the polling in separate thread will still be waiting with the old timeout.

Right, and one more word, it is limited to the case when you are not in control of the outer loop and had resorted to polling in another thread. I understand it's a valid use case worth supporting, and my patch does not fix it.

As for the busy-loop problem, I admit my case is a synthetic example that is not backed by any real world use cases; but technically it is affected by embedding. Suppose a user has a large code base previously driven by UV_RUN_DEFAULT and want to integrate with another loop, there could be corner cases that worked before suddenly become busy loops. But as you said, Electron has been doing this for years, so I think it is safe to assume such cases did not occur in practice? I guessed it should be rare but I had no statistics to back my claim.

And if we declare such a busy loop is not a problem, may I ask maintainers to give some directions on bringing forward my patch? I think it's a better option to support single-threaded embedding, although not perfect. IMHO a perfect fix would need to fundamentally change our execution order in a loop iteration, and that would break a lot of things so not realistic to be brought into v1.x.

@stale

stale Bot commented Jan 9, 2022

Copy link
Copy Markdown

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.

@stale stale Bot added the stale label Jan 9, 2022
@codebytere

Copy link
Copy Markdown
Contributor

@zcbenz are you still interested in this?

@stale stale Bot removed the stale label Jan 17, 2022
@zcbenz

zcbenz commented Jan 17, 2022

Copy link
Copy Markdown
Contributor Author

Yes, and I think we don't have any remaining issue before merging this.

@codebytere

codebytere commented Feb 1, 2022

Copy link
Copy Markdown
Contributor

@bnoordhuis is there anything preventing this from being merged? We'd really like to move it forward.

@bnoordhuis

Copy link
Copy Markdown
Member

It's a new API and non-trivial so it should get at least one more sign-off from a maintainer.

Comment thread docs/src/loop.rst Outdated
Comment thread docs/src/loop.rst Outdated
Comment thread src/uv-common.h Outdated
@zcbenz zcbenz force-pushed the uv-loop-option-interrupt branch from 63b426f to 6c2605e Compare August 15, 2022 00:59
@stale stale Bot removed the stale label Aug 15, 2022
@stale

stale Bot commented Sep 21, 2022

Copy link
Copy Markdown

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.

@stale stale Bot added the stale label Sep 21, 2022
@zcbenz zcbenz force-pushed the uv-loop-option-interrupt branch from 6c2605e to 0494975 Compare September 28, 2022 07:32
@stale stale Bot removed the stale label Sep 28, 2022
@cztomsik

cztomsik commented Oct 23, 2022

Copy link
Copy Markdown

@zcbenz FYI I've been trying to do a similar thing but I've figured out a little different approach which does not need any changes in libuv but it's still arguably a hack I think.
https://github.com/cztomsik/graffiti/blob/master/src/napi.zig#L90

in general, I thought I could do wait+re-render in the idle task but that forces timeout to be zero, so I've splitted the whole process into 2 parts - prepare which reads the timeout, waits for GUI events and handles them, and then I start the idle task, which re-renders everything and stops itself again (this is important because then I can read correct timeout again)

to read nodejs events I read backend_fd - it's not working yet because I need to add a semaphore there but I think it should generally work because the timeout can only change during prepare task and I know about that so I can tell awaker thread how long it needs to wait.

I mean, I'm not saying it's a better solution but maybe it's interesting to know. And maybe we can figure out something even better.

@vtjnash

vtjnash commented Oct 23, 2022

Copy link
Copy Markdown
Member

I believe this is fixed now, so this PR is unnecessary

@vtjnash vtjnash closed this Oct 23, 2022
@cztomsik

Copy link
Copy Markdown

I believe this is fixed now, so this PR is unnecessary

The hack is still in the electron patches directory and there was a push in this branch 26 days ago...
https://github.com/electron/electron/blob/3aed596fbac6b04984ee2ff50f6706884d324e71/patches/node/feat_add_uv_loop_interrupt_on_io_change_option_to_uv_loop_configure.patch

@vtjnash

vtjnash commented Oct 24, 2022

Copy link
Copy Markdown
Member

Electron tries to read uv_backend_timeout on a background thread, which adds a fair amount complexity and may reduce performance. I am a bit suspicious whether their code would pass under ThreadSanitizer analysis. This function is usually meant to be used on the primary thread only, where it it can be used to set a timer/timeout in the embedder's event loop. Alternatively, they could query the value upon return from uv_run back to the embedder's loop, and then interrupt the backend_fd polling themselves to push the new timeout to it. That would avoid the data-race issues it looks like their current design might have. Though it would likely also have worse performance due to the excess syscalls and thread-synchronization required by this strategy.

@zcbenz

zcbenz commented Aug 22, 2023

Copy link
Copy Markdown
Contributor Author

@vtjnash Unfortunately we still need this patch to make node integration work in Electron. The special thing about Electron is that its main event loop is a GUI event loop, which almost no other libuv users do, and I think that is why you thought this issue was solved.

I have written down a detailed explanation of why this patch is needed. I understand this use case is not typical and probably only Electron needs this patch, but on the other hand the number of users running Electron apps is too large that I believe it is worth having this patch in upstream.

Why Electron has to poll libuv loop in a background thread

Electron is a GUI app, whose main thread must run the GUI event loop to keep responsive, and on Windows and macOS unfortunately the GUI event loop can not run together with IO event loop (which is used by libuv).

This is a Windows GUI event loop:

while(GetMessage(&Msg, NULL, 0, 0) > 0) {
    TranslateMessage(&Msg);
    DispatchMessage(&Msg);
}

And this is a Windows IO event loop:

while (true) {
  OVERLAPPED* overlapped;
  ULONG_PTR completionKey;
  DWORD bytesTransferred;

  GetQueuedCompletionStatus(completionPort, &bytesTransferred, &completionKey, &overlapped, INFINITE);
}

The GetMessage API can't wait for IO events, and the GetQueuedCompletionStatus API can't wait for GUI messages, so it is impossible to run both loops at the same time.

Thus in Electron, we create a polling fd in background thread to poll libuv's backend fd, and whenever there is a new IO event, the GUI event loop will be notified to run libuv event loop in the main thread to handle the IO event.

// In background thread
while (WaitForSignalFromMainThread()) {
  PollEventsInBackgroundThread();
  PostMessageToMainThread();
}
// In main thread
while(GetMessage() > 0) {
  if (IsIOMessageFromBackgroundThread()) {
    uv_run_loop_once();
    SendSignalToBackgroundThreadToContinuePolling();
  } else {
    TranslateMessage(&Msg);
    DispatchMessage(&Msg);
  }
}

Problem with background thread polling

As mentioned above, Electron creates a new fd (let's call it epoll_) to listen to IO events happened in libuv's event loop:

  int backend_fd = uv_backend_fd(uv_loop_);
  struct epoll_event ev = {0};
  ev.events = EPOLLIN;
  ev.data.fd = backend_fd;
  epoll_ctl(epoll_, EPOLL_CTL_ADD, backend_fd, &ev);

And then polls the epoll_ in background thread to get notified when there is a new IO event:

void PollEventsInBackgroundThread() {
  int timeout = uv_backend_timeout(uv_loop_);
  int r;
  do {
    struct epoll_event ev; 
    r = epoll_wait(epoll_, &ev, 1, timeout);
  } while (r == -1 && errno == EINTR);
}

(The implementation calls uv_backend_timeout in background thread does have thread safety issue as mentioned by the comment above, but that can be fixed and it is not really related to this patch).

However the epoll_wait(epoll_, timeout) running in background won't return when the backend fd of libuv watches new timer or fd.

For example, when users call setTimeout(() => { console.log('ping') }, 1000 in an Electron app, the epoll_wait(epoll_, timeout) won't return after 1000ms, because the timeout which it waits on is the old value.

Similarly, when users call fs.read('file', () => {}) in Electron, the epoll_wait(epoll_, timeout) won't return neither, because the epoll_'s watch list does not include the file being read.

How to solve the problem

To solve the problem, we force the epoll_wait(epoll_) to return whenever the backend fd's watcher list changes, by calling uv__async_send. Once the epoll_wait(epoll_, timeout) returns, the background thread will notify the main thread to run libuv loop, wait for the result, and then get a new timeout and call epoll_wait(epoll_, timeout) again, this time it will be able to watch the newly added timers and fds.

This is what this patch does, by adding a new UV_LOOP_INTERRUPT_ON_IO_CHANGE option, and other users of libuv won't be affected by default.

The performance however does not really matter in Electron, the node integration happens in the GUI thread where it is not supposed to do any heavy operations at all, so a little performance penalty is totally acceptable in exchange of full Node support.

@XeCycle

XeCycle commented Aug 22, 2023

Copy link
Copy Markdown
Contributor

What about something like this: (sorry this is pseudocode and I didn't test it, I hope you can get my idea)

void gui_loop() {
  while (wait_gui_event()) {
    process_gui_event();
    uv_run(loop, UV_RUN_NOWAIT);
    tell_io_poll_thread_to_restart(uv_backend_timeout(loop));
  }
}

I guess this should work on Linux, but it depends on that I can have a thread blocked in epoll_wait, and at the same time modify its (kernel) watch list in another thread. So if I need to call uv_run to sync uv watch list to the kernel watch list, just do it here, no need to stop the polling thread beforehand. And tell the polling thread that timeout may have changed (I can cache the target time to avoid some messaging overhead).

But, allow me to reiterate, this depends on that in epoll_wait(2):

While one thread is blocked in a call to epoll_wait(), it is possible for another thread to add a file descriptor to the waited-upon epoll instance. If the new file descriptor becomes ready, it will cause the epoll_wait() call to unblock.

I don't know how IOCP behaves, so I'm not sure whether this can work for you.

@zcbenz

zcbenz commented Aug 22, 2023

Copy link
Copy Markdown
Contributor Author

@XeCycle Thanks for the pseudocode and it totally makes sense, and I think it can solve the problem without this patch.

But with this approach tell_io_poll_thread_to_restart() will be called for every single GUI event, and the number of GUI events are huge even when users are doing simple interactions -- for example, simply moving mouse from one corner to the other will generate thousands of events -- so the vast majority of uv_run and tell_io_poll_thread_to_restart calls will have no result.

Electron apps are already famous for being resource hogs, I think having a patch in libuv to save the resources should be worthwhile.

@bnoordhuis

Copy link
Copy Markdown
Member

This is where you would use uv_async_send() to wake up the libuv thread, right? Multiple calls get coalesced into a single wake-up so performance shouldn't be too terrible.

@vtjnash

vtjnash commented Aug 22, 2023

Copy link
Copy Markdown
Member

I can have a thread blocked in epoll_wait, and at the same time modify its (kernel) watch list in another thread

Yes, but Electron is depending on that being safe anyways, otherwise it would have had a far worse data race on its hands by letting that epoll_wait call exist at the same time as libuv was running on another thread. But as you saw in epoll_wait(2), this is well-defined behavior. The backend thread does not need timeouts or interruption. That just adds spurious overhead.

But with this approach tell_io_poll_thread_to_restart() will be called for every single GUI event

This is already how it is implemented, so I fail to see why this would become an issue now: https://github.com/electron/electron/blob/e1d63794e5ce74006188ed2b7f1c35b5793caf36/shell/common/node_bindings.cc#L742

@vtjnash

vtjnash commented Aug 22, 2023

Copy link
Copy Markdown
Member

Some improvements to your pseudo-code. If you squint at the existing nodejs code, I think you will find it is already nearly identical to this, and only needs a couple tweaks:

__Atomic(bool) uv_run_needed;
void gui_loop() {
  while (wait_gui_event(uv_backend_timeout(loop))) {
    process_gui_events();
    if (uv_run_needed || uv_backend_timeout(loop) == 0) {
        // this conditional is optional, but can be good for UI performance to avoid spurious epoll_wait syscall
        // it can also be a while loop, depending on how heavily you want to prioritize uv over gui events
        uv_run(loop, UV_RUN_NOWAIT);
        uv_run_needed = false;
        std::lock_guard<>() pthread_cond_notify();
     }
  }
}
void epoll_loop() {
    while (1) {
        epoll_wait();
        uv_run_needed = true;
        gui_loop_notify();
        std::lock_guard<>() if (uv_run_needed)
            pthread_cond_wait();
    }
}

This should fix the data-race on timeout and avoids the spurious thread wakeup on timeouts. In exchange, it uses an explicit seq-cst fence (instead of relying on it being probably implied by epoll) on uv_run_needed for keeping these two threads moving in lockstep and satisfying the DRF (data race freedom) requirement.

@zcbenz

zcbenz commented Aug 24, 2023

Copy link
Copy Markdown
Contributor Author

I can have a thread blocked in epoll_wait, and at the same time modify its (kernel) watch list in another thread

Yes, but Electron is depending on that being safe anyways, otherwise it would have had a far worse data race on its hands by letting that epoll_wait call exist at the same time as libuv was running on another thread. But as you saw in epoll_wait(2), this is well-defined behavior. The backend thread does not need timeouts or interruption. That just adds spurious overhead.

After writing some tests, it seems that you are right, the backend thread does not need an explicit interruption to get notified of events of newly added fds.

But somehow this does not match what I observed in Electron: without my patch, callbacks of async APIs like fs.readFile won't be called. It could be a bug in Electron's implementation or it could be some edge cases not covered by the discussions here, I'll need more debugging to know.

But with this approach tell_io_poll_thread_to_restart() will be called for every single GUI event

This is already how it is implemented, so I fail to see why this would become an issue now: https://github.com/electron/electron/blob/e1d63794e5ce74006188ed2b7f1c35b5793caf36/shell/common/node_bindings.cc#L742

The referenced UvRunOnce method is not called for every single GUI event, it is only called when the background thread gets notified of new IO events.

I have a much simplified version of the node integration code, which should be more readable:
https://github.com/yue/yode/blob/node-18/src/node_integration.cc

And the UvRunOnce is called like this:

void NodeIntegration::WakeupMainThread() {
  PostTask([this] {
    this->UvRunOnce();
  });
}

Some improvements to your pseudo-code. If you squint at the existing nodejs code, I think you will find it is already nearly identical to this, and only needs a couple tweaks:

Thanks for the code, I think in theory it implements the event loop integration in a very clean way. But when trying to implement it in actual code, I found that the biggest problem is that the wait_gui_event(timeout) API does not exist.

For GTK apps in Linux the wait_gui_event API is g_main_context_iteration, and for Windows GUI apps it is GetMessage -- neither of them accepts a timeout parameter, which is totally understandable though since it is a very rare use case and even uv_run does not have a timeout parameter too. The CFRunLoopRunInMode API on macOS does have a timeout parameter though.

And even if all the underlying system APIs do support timeout parameter, I found that Chromium's message loop class does not support passing parameter. if I have to choose between a rather large Chromium patch and a simple libuv patch, I will have to choose the latter.

So in real code to make timeouts work I have to set a timer and keep updating it in every gui message loop iteration:

void gui_loop() {
  while (wait_gui_event()) {
    process_gui_events();
    if (uv_run_needed || uv_backend_timeout(loop) == 0) {
        ...
     }
    clearTimeout(timer);
    timer = setTimeout(uv_backend_timeout(loop));
  }
}

Or degrade gui_loop to XeCycle's version:

void gui_loop() {
  while (wait_gui_event()) {
    process_gui_events();
    uv_run(loop, UV_RUN_NOWAIT);
  }
}

But still I'm not comfortable with the above versions though, either of them has to make tremendous useless calls when we could just make necessary calls by adding a small patch in libuv.

@vtjnash

vtjnash commented Aug 24, 2023

Copy link
Copy Markdown
Member

I know Gtk supports this timeout, because I integrated it years ago this way: https://github.com/JuliaGraphics/Gtk.jl/blob/881fdce8eb5a8bce2f1a846f79ac7142177e5b2a/src/GLib/signals.jl#L299
On Windows, the version of the function with timeout support is MsgWaitForMultipleObjects(Ex)/PeekMessage

If Chromium is bad at clearTimeout, you can coalesce those. In nodejs / libuv, normally that would be done with a prepare hook that starts the necessary timeout before the poll call. I don't know the equivalent for Chromium.

@zcbenz

zcbenz commented Aug 25, 2023

Copy link
Copy Markdown
Contributor Author

Thanks for the references, there are quite a lot of new information and I'll give them a try.

liujinye-sys pushed a commit to open-vela/apps_system_libuv that referenced this pull request Jul 23, 2025
Pending work may be either (on any platform) pending_queue callbacks or
(on unix) watcher_queue handles to add to the io poll object.
Previously, we might have gotten somewhat stuck if the user caused an
event to be added to one of these in the idle or prepare callbacks, or
was embedding libuv.

Refs: libuv/libuv#3234
Refs: libuv/libuv#3101
Refs: libuv/libuv#3308
liujinye-sys pushed a commit to open-vela/apps_system_libuv that referenced this pull request Dec 16, 2025
Pending work may be either (on any platform) pending_queue callbacks or
(on unix) watcher_queue handles to add to the io poll object.
Previously, we might have gotten somewhat stuck if the user caused an
event to be added to one of these in the idle or prepare callbacks, or
was embedding libuv.

Refs: libuv/libuv#3234
Refs: libuv/libuv#3101
Refs: libuv/libuv#3308
jkleinsc added a commit to electron/electron that referenced this pull request Jun 2, 2026
Ref: libuv/libuv#3308

Co-Authored-By: Claude <noreply@anthropic.com>
jkleinsc added a commit to electron/electron that referenced this pull request Jun 3, 2026
Ref: libuv/libuv#3308

Co-Authored-By: Claude <noreply@anthropic.com>
jkleinsc added a commit to electron/electron that referenced this pull request Jun 4, 2026
Ref: libuv/libuv#3308

Co-Authored-By: Claude <noreply@anthropic.com>
jkleinsc added a commit to electron/electron that referenced this pull request Jun 5, 2026
Ref: libuv/libuv#3308

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 31b7fb0)
jkleinsc added a commit to electron/electron that referenced this pull request Jun 8, 2026
* chore: bump node in DEPS to v24.16.0

* chore: remove upstreamed patch

Node.js restored fs patchability in the ESM loader upstream, making
the fix_lazyload_fs_in_esm_loaders_to_apply_asar_patches.patch
obsolete (the patch's exact change is now in lib/internal/modules/
esm/{load,resolve,translators}.js).

Ref: nodejs/node#62835

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 72638db)

* fix(patch): re-add experimental_fetch member after upstream cleanup

Upstream removed the experimental_fetch field from EnvironmentOptions,
but Electron's patch still registers --experimental-fetch as a CLI
option bound to that field. Re-add the member so the option compiles.

Ref: nodejs/node#62759

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 6fd9370)

* fix(patch): cast const away when freeing uv_cpu_info_t.model

libuv 1.52.1 typed uv_cpu_info_t.model as const char*, but uv__free
takes void*. Electron builds with -Werror,-Wincompatible-pointer-types-
discards-qualifiers, so add a cast. The memory is allocated via strdup
so the cast is safe.

Ref: nodejs/node#61829

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 90d1009)

* fix(patch): silence sign-compare warning in sessionVarintGetSafe
Cast int nBuf to size_t when comparing with sizeof(aCopy) so the
bundled sqlite3 amalgamation compiles under -Werror,-Wsign-compare.

Ref: nodejs/node#62699

Co-Authored-By: Claude <noreply@anthropic.com>

* test: move root package.json aside in node spec runner

third_party/electron_node lives under Chromium's src/, whose package.json
("type": "module") is always an ancestor of the Node.js test tree. Upstream
assumes no package.json sits above the tests, so that ancestor changes how
test files and fixtures resolve their module type: it disables module-syntax
detection (breaking test-compile-cache-typescript-esm) and emits
MODULE_TYPELESS_PACKAGE_JSON warnings that break tests asserting clean stderr
(test-esm-detect-ambiguous, test-esm-import-meta-main-eval,
test-output-coverage-with-mock).

Move src/package.json aside for the duration of the run so the environment
matches upstream exactly, then restore it. The original is kept in a sibling
backup file so an interrupted/killed run self-heals on the next invocation
rather than leaving src/package.json missing.

Ref: Unable to locate reference

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 29abd0c)

* fix(patch): mark test-macos-app-sandbox as flaky

The test copies the Electron binary into a standalone .app bundle and
code-signs it; under parallel suite runs this races with dyld resolving
the Electron Framework rpath and intermittently aborts (SIGABRT). It
passes reliably when run alone. Mark it flaky so flakes don't fail CI.

Ref: Unable to locate reference

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 3853529)

* test: disable test-tls-set-default-ca-certificates-extra-override

setDefaultCACertificates() round-trips the default CA set through
BoringSSL's X509_STORE, which dedups a duplicate-subject root (DigiCert
Global Root CA) that OpenSSL keeps. The set therefore loses one cert on
re-add (149 -> 148), so the test's assertEqualCerts round-trip check
fails under Electron's BoringSSL. The sibling -recovery test is disabled
for the same reason.

Ref: nodejs/node#58822

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 3dbc596)

* fix(patch): mark test-runner watch tests as flaky

test-run-watch-repeatedly, test-run-watch-run-duration and
test-run-watch-without-file race under parallel suite load: the watcher
fires an extra re-run before the assertion, so the expected single-run
output ("tests 1") arrives with accumulated subtests. All three pass in
isolation.

Ref: nodejs/node#44898

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 0f9c327)

* chore(patches): update libuv const-cast patch management

Combine the Windows libuv cpu_info const-cast update into the existing
chore_cast_const_away_when_freeing_uv_cpu_info_t_model.patch and keep
a single patch-management commit for the final exported patch series.

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit c590614)

* fix(patch): add UV_LOOP_INTERRUPT_ON_IO_CHANGE option

Ref: libuv/libuv#3308

Co-Authored-By: Claude <noreply@anthropic.com>

* chore: update patches (trivial only)

Co-Authored-By: Claude <noreply@anthropic.com>

* chore: update patches

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(patch): adapt GetIsolate removal for the inspector NetworkAgent

Adapts api_remove_deprecated_getisolate.patch for the inspector NetworkAgent:
adapt the GetIsolate removal to the NetworkAgent member helpers, qualify the
v8 Maybe helpers in network_agent, and drop duplicate inspector helper
definitions.

Ref: nodejs/node#62162

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(patch): guard vm contextual-store strict assertion for Chromium's V8

test-vm-global-contextual-store was added upstream in nodejs/node#62571. It
asserts that an undeclared strict-mode contextual store (`"use strict"; z = 42`)
throws a ReferenceError. Chromium's V8 removed
v8::PropertyCallbackInfo<T>::This(), so Electron's contextify setter cannot
distinguish a contextual store from an explicit `globalThis.x = v` store; it
keeps the original behavior (writing to the sandbox) so explicit global stores
in vm modules keep working (test-vm-module-referrer-realm). Guard the
strict-mode ReferenceError assertion in the new test under Electron.

Ref: nodejs/node#62571
Ref: nodejs/node#60616
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: electron-roller[bot] <84116207+electron-roller[bot]@users.noreply.github.com>
Co-authored-by: John Kleinschmidt <jkleinsc@electronjs.org>
Co-authored-by: Claude <noreply@anthropic.com>
jkleinsc added a commit to electron/electron that referenced this pull request Jun 9, 2026
* chore: bump node in DEPS to v24.16.0

* chore: remove upstreamed patch

Node.js restored fs patchability in the ESM loader upstream, making
the fix_lazyload_fs_in_esm_loaders_to_apply_asar_patches.patch
obsolete (the patch's exact change is now in lib/internal/modules/
esm/{load,resolve,translators}.js).

Ref: nodejs/node#62835

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 72638db)
(cherry picked from commit 0409547)

* fix(patch): re-add experimental_fetch member after upstream cleanup

Upstream removed the experimental_fetch field from EnvironmentOptions,
but Electron's patch still registers --experimental-fetch as a CLI
option bound to that field. Re-add the member so the option compiles.

Ref: nodejs/node#62759

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 6fd9370)
(cherry picked from commit 46b28d2)

* fix(patch): cast const away when freeing uv_cpu_info_t.model

libuv 1.52.1 typed uv_cpu_info_t.model as const char*, but uv__free
takes void*. Electron builds with -Werror,-Wincompatible-pointer-types-
discards-qualifiers, so add a cast. The memory is allocated via strdup
so the cast is safe.

Ref: nodejs/node#61829

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 90d1009)
(cherry picked from commit 07827c6)

* fix(patch): silence sign-compare warning in sessionVarintGetSafe
Cast int nBuf to size_t when comparing with sizeof(aCopy) so the
bundled sqlite3 amalgamation compiles under -Werror,-Wsign-compare.

Ref: nodejs/node#62699

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 379d109)

* test: move root package.json aside in node spec runner

third_party/electron_node lives under Chromium's src/, whose package.json
("type": "module") is always an ancestor of the Node.js test tree. Upstream
assumes no package.json sits above the tests, so that ancestor changes how
test files and fixtures resolve their module type: it disables module-syntax
detection (breaking test-compile-cache-typescript-esm) and emits
MODULE_TYPELESS_PACKAGE_JSON warnings that break tests asserting clean stderr
(test-esm-detect-ambiguous, test-esm-import-meta-main-eval,
test-output-coverage-with-mock).

Move src/package.json aside for the duration of the run so the environment
matches upstream exactly, then restore it. The original is kept in a sibling
backup file so an interrupted/killed run self-heals on the next invocation
rather than leaving src/package.json missing.

Ref: Unable to locate reference

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 29abd0c)
(cherry picked from commit 3190c99)

* fix(patch): mark test-macos-app-sandbox as flaky

The test copies the Electron binary into a standalone .app bundle and
code-signs it; under parallel suite runs this races with dyld resolving
the Electron Framework rpath and intermittently aborts (SIGABRT). It
passes reliably when run alone. Mark it flaky so flakes don't fail CI.

Ref: Unable to locate reference

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 3853529)
(cherry picked from commit 2dfc1ae)

* test: disable test-tls-set-default-ca-certificates-extra-override

setDefaultCACertificates() round-trips the default CA set through
BoringSSL's X509_STORE, which dedups a duplicate-subject root (DigiCert
Global Root CA) that OpenSSL keeps. The set therefore loses one cert on
re-add (149 -> 148), so the test's assertEqualCerts round-trip check
fails under Electron's BoringSSL. The sibling -recovery test is disabled
for the same reason.

Ref: nodejs/node#58822

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 3dbc596)
(cherry picked from commit 44f4094)

* fix(patch): mark test-runner watch tests as flaky

test-run-watch-repeatedly, test-run-watch-run-duration and
test-run-watch-without-file race under parallel suite load: the watcher
fires an extra re-run before the assertion, so the expected single-run
output ("tests 1") arrives with accumulated subtests. All three pass in
isolation.

Ref: nodejs/node#44898

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 0f9c327)
(cherry picked from commit 792e1e5)

* chore(patches): update libuv const-cast patch management

Combine the Windows libuv cpu_info const-cast update into the existing
chore_cast_const_away_when_freeing_uv_cpu_info_t_model.patch and keep
a single patch-management commit for the final exported patch series.

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit c590614)
(cherry picked from commit 0506b64)

* fix(patch): add UV_LOOP_INTERRUPT_ON_IO_CHANGE option

Ref: libuv/libuv#3308

Co-Authored-By: Claude <noreply@anthropic.com>
(cherry picked from commit 31b7fb0)

* fix(patch): update GetIsolate removal for inspector NetworkAgent refactor

Upstream refactored the inspector network agent helpers into
NetworkAgent:: methods, shifting where context->GetIsolate() appeared
and reorganizing node_contextify's PropertyDefinerCallback. Re-applied
the deprecated context->GetIsolate() -> v8::Isolate::GetCurrent()
replacements to the new structure.

Ref: nodejs/node#62162

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix: drop dangling .patches entries for non-existent patch files

The cherry-pick of the cast-const fix (343078bf01) erroneously added
four entries to patches/node/.patches that reference patch files which
were never part of the 41-x-y branch (externalpointertypetag, thinlto
symbol preservation, v8 code cache, and startup snapshot generation).
The missing files caused `e sync` to abort while reading patches.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* chore: update patches (trivial only)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test: disable test-vm-global-contextual-store

This new upstream test asserts that a strict-mode store to an undeclared
global in a vm context throws ReferenceError. Chromium's V8 contextify
global-proxy interceptor instead creates the property, so the assertion
is fundamentally incompatible with Electron's V8 and cannot be guarded
to pass.

Ref: nodejs/node#62571

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: electron-roller[bot] <84116207+electron-roller[bot]@users.noreply.github.com>
Co-authored-by: John Kleinschmidt <jkleinsc@electronjs.org>
Co-authored-by: Claude <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

The problem with integrating libuv into another event loop.

7 participants