Real-time collaboration: Implement proof-of-concept long-polling (SSE) sync provider #74331
chriszarate wants to merge 2 commits into
How will this handle the situation where the transient is deleted early? My initial impression is that it would stop being able to sync until a full snapshot is sent again? What if the transient storage is under pressure and being deleted frequently? Do we need to worry about that scenario?
|
Sync would continue to function, but individual peers may not have an up-to-date representation of the Yjs document state until they receive a more complete snapshot from another peer.
Great point. Because eviction is not controlled by the application, neither transients nor the object cache is an ideal persistence layer for sync data. Under severe pressure, where sync data cannot survive longer than 30 seconds, I'd guess that syncing may cease to function reliably. This PR is really just to show that it's possible to implement a Yjs provider backed by the WordPress database (and not some other network service), and therefore provide a sync transport that works (in theory) on every WordPress installation. Instead of transients, maybe we should target a new built-in post type and manage evictions manually? Or perhaps a better idea will emerge.
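To make the eviction concern concrete, here is a minimal TypeScript sketch. It is not the PR's implementation: the class `EvictableLog`, its methods, and the idea of numbering messages with increasing ids are all assumptions made for illustration. It shows how a reader holding a cursor could detect that messages were evicted before it saw them, which is the point at which a more complete snapshot from another peer would be needed.

```typescript
// Hypothetical message shape: ids increase by one per appended message.
interface SyncMessage {
  id: number;
  payload: string;
}

// In-memory stand-in for a storage layer that may drop old entries.
class EvictableLog {
  private messages: SyncMessage[] = [];
  private nextId = 1;

  // Store a message and return the id assigned to it.
  append(payload: string): number {
    const id = this.nextId++;
    this.messages.push({ id, payload });
    return id;
  }

  // Simulate eviction of every message with id <= throughId.
  evictThrough(throughId: number): void {
    this.messages = this.messages.filter((m) => m.id > throughId);
  }

  // Return messages newer than the cursor, and whether any were lost.
  readSince(lastId: number): { messages: SyncMessage[]; gap: boolean } {
    const messages = this.messages.filter((m) => m.id > lastId);
    // The next id the reader expects is lastId + 1. If the oldest message
    // still available is newer than that, something was evicted unseen.
    const firstAvailable = messages.length > 0 ? messages[0].id : this.nextId;
    return { messages, gap: firstAvailable > lastId + 1 };
  }
}
```

A reader that sees `gap: true` cannot rebuild the document from the log alone and would have to wait for, or request, a fuller state from a peer.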
|
I like this approach a lot. SSE was very underused until its recent renaissance with LLM chatbots. (Nitpick: technically, long polling is a distinct technique from SSE.)
A CPT would surely be slower than transients or the object cache. Transients are definitely not ideal, but as you said, they are stored in the persistent object cache if you are using one (which anyone who is having performance problems should be doing). Redis and Memcached are the most popular and can handle 100k+ operations per second. SQLite Object Cache uses SQLite, which any WP install should be able to use, and is faster than Redis for this purpose. It also uses PHP's APCu if available, which is even faster than SQLite and is shared across PHP workers. Anyone who is hitting limits with any of these options is either on terrible hardware or should be able to solve these problems with a custom service outside of WP (because they are running a large enterprise).
Thanks @nickchomey! Great point. I agree that SSE can lead to surprisingly responsive collaboration.
Agreed generally, but our initial goal is to ship a provider that can function, under limits, on just about any host. We can simultaneously light the path for others who may be ready to devote additional resources to a better implementation. Our first step is focusing on short-polling since SSE support is not guaranteed. Please follow along on this follow-up PR:
|
Thanks for the info. Though, when would SSE support not be guaranteed? It's just normal HTTP with a different header...? Whether doing anything like that in a synchronous PHP environment is a good idea is another matter entirely, though. You just do CQRS: a long-lived SSE connection to push things out, and then ad hoc POST requests to make changes. This is the basis of Mercure. It's written in Go, but it is integrated with FrankenPHP and meant for the PHP-based (Symfony) API Platform. Or are you perhaps referring to any network/proxy complexity that might cause issues for SSE connections?
Yes, the main obstacle is output buffering, which might be configured (and not overridable) by the server, or by any HTTP layer in between the client and the server (caching proxies, load balancers, etc.).
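For readers less familiar with the wire format behind this exchange, the following is a simplified TypeScript sketch of how `text/event-stream` frames are written and read. The helper names `formatSseEvent` and `parseSseStream` are hypothetical, and the parser assumes `\n` line endings and ignores the `retry` field. Each event ends with a blank line, so a client cannot dispatch an event until that blank line arrives. This is why a layer that holds output in a buffer delays delivery.

```typescript
// One server-sent event. Field names follow the text/event-stream format.
interface SseEvent {
  id?: string;
  event?: string;
  data: string;
}

// Serialize an event: one "field: value" line per field, one "data:" line
// per line of payload, and a blank line that terminates the event.
function formatSseEvent(e: SseEvent): string {
  const lines: string[] = [];
  if (e.id !== undefined) lines.push(`id: ${e.id}`);
  if (e.event !== undefined) lines.push(`event: ${e.event}`);
  for (const part of e.data.split("\n")) lines.push(`data: ${part}`);
  return lines.join("\n") + "\n\n";
}

// Parse a chunk of complete events (simplified; see assumptions above).
function parseSseStream(chunk: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of chunk.split("\n\n")) {
    const data: string[] = [];
    const ev: SseEvent = { data: "" };
    for (const line of block.split("\n")) {
      const colon = line.indexOf(":");
      // Skip comment lines (leading ":") and lines without a colon.
      if (colon <= 0) continue;
      const field = line.slice(0, colon);
      let value = line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // one optional space
      if (field === "data") data.push(value);
      else if (field === "id") ev.id = value;
      else if (field === "event") ev.event = value;
    }
    // Blocks without any data line are not dispatched as events.
    if (data.length === 0) continue;
    ev.data = data.join("\n");
    events.push(ev);
  }
  return events;
}
```

A comment line such as `: keepalive` produces no event in this parser.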
What?
A proof-of-concept exploring a default Yjs provider based on long-polling (server-sent events) and state stored in the WordPress database. See #74085
Why?
The current default provider is based on WebRTC and is unreliable in certain network conditions. Making it reliable requires centralized infrastructure that is probably unattainable.
An alternative default transport could use long-polling against an internal endpoint with state stored in the WordPress database. This approach would face performance issues on medium-to-large sites but would allow users to explore collaborative editing under some protective limits (e.g., a maximum of two simultaneous collaborators). Moving beyond these protective limits would require a more robust host-provided transport such as WebSockets.
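As an illustration of the kind of protective limit described above, here is a minimal TypeScript sketch of a per-room collaborator cap. The names `RoomRegistry`, `join`, `leave`, and `MAX_COLLABORATORS` are hypothetical and do not come from this PR. The value 2 mirrors the example limit in the text.

```typescript
// Example cap taken from the text above; a real limit could differ.
const MAX_COLLABORATORS = 2;

// Tracks which clients are currently collaborating in each room.
class RoomRegistry {
  private rooms = new Map<string, Set<string>>();

  // Returns true if the client may join the room or is already in it.
  join(room: string, clientId: string): boolean {
    const members = this.rooms.get(room) ?? new Set<string>();
    if (members.has(clientId)) return true;
    if (members.size >= MAX_COLLABORATORS) return false;
    members.add(clientId);
    this.rooms.set(room, members);
    return true;
  }

  // Frees the client's slot so that another collaborator can join.
  leave(room: string, clientId: string): void {
    const members = this.rooms.get(room);
    if (!members) return;
    members.delete(clientId);
    if (members.size === 0) this.rooms.delete(room);
  }
}
```

With this cap, a third client is refused until one of the first two leaves. A host-provided transport could raise or remove the limit.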
How?
HttpSseProvider
/sync/v1/messages. The new provider will connect to this endpoint.
EventSource connections to this endpoint.
Limitations and considerations
HttpSseProvider opens a new EventSource connection for each instance. As we provide support for additional entity syncing, this will consume more and more HTTP connections, overwhelming lower-resourced hosts.
EventSource connection, which will require separately tracking the rooms and last_message_ids for each client.
Testing Instructions
Testing Instructions for Keyboard
n/a
Screenshots or screencast
sse-sync.mov
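The per-client tracking of rooms and last_message_ids mentioned under "Limitations and considerations" could be sketched as follows. This is a TypeScript illustration under assumptions, not code from this PR: the class `ClientCursors`, its methods, and the use of numeric message ids are all hypothetical.

```typescript
// Tracks, for each client, the rooms it follows and the last message id
// it has received in each room (clientId -> room -> last message id).
class ClientCursors {
  private cursors = new Map<string, Map<string, number>>();

  // Start following a room, optionally resuming from a known message id.
  subscribe(clientId: string, room: string, lastMessageId = 0): void {
    const rooms = this.cursors.get(clientId) ?? new Map<string, number>();
    if (!rooms.has(room)) rooms.set(room, lastMessageId);
    this.cursors.set(clientId, rooms);
  }

  // Record delivery of a message. Cursors only ever move forward.
  acknowledge(clientId: string, room: string, messageId: number): void {
    const rooms = this.cursors.get(clientId);
    const current = rooms?.get(room);
    if (rooms === undefined || current === undefined) return;
    if (messageId > current) rooms.set(room, messageId);
  }

  // Stop following a room and forget the client once it has no rooms.
  unsubscribe(clientId: string, room: string): void {
    const rooms = this.cursors.get(clientId);
    if (!rooms) return;
    rooms.delete(room);
    if (rooms.size === 0) this.cursors.delete(clientId);
  }

  // Rooms the client currently follows.
  roomsFor(clientId: string): string[] {
    return Array.from(this.cursors.get(clientId)?.keys() ?? []);
  }

  // Last delivered message id, or undefined if not subscribed.
  lastMessageId(clientId: string, room: string): number | undefined {
    return this.cursors.get(clientId)?.get(room);
  }
}
```

Keeping one cursor per client and room lets a single connection carry several rooms, with each room resuming from its own position.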