storage: Introduce storage coordination thread and infrastructure for parallel next-generation IndexedDB implementation #40661
Conversation
High-level thing: I'm not sure it makes sense to check in a million lines of test expectations, especially considering that in Servo the CI is typically the only thing that can use expectations with any kind of consistency. For now, let's keep the expectations outside the repository until the CI can use them. Question for folks working on the IndexedDB backend: is there any way that this can land as-is? I mean, is it complete enough to replace the existing backend?
The work on this backend has to be incremental. One option is to create a feature branch to work on. That would allow for an as-is landing onto main.
If we do that, it'd be good to backport the renames/client storage thread over to main. |
I understand the concern about checking in a large number of test expectations. For local development, though, having the expectations available is quite practical and it works. The README in components/storage/ explains how they can be used effectively. That said, we can definitely explore alternative places to host or share these expectations if committing them directly to the repository isn’t desirable right now.
The goal here is to enable the redesign to move forward incrementally and collaboratively. The parallel experimental implementation doesn’t need to be complete before landing. The point is to create a structure where we can iterate on the redesign without risking regressions or blocking ongoing work on the current backend. This lets multiple people make progress in parallel while keeping Servo stable. Also, this work is not only about IndexedDB. It is part of implementing the Storage specification, especially storage buckets, and it provides a coordination layer for all storage endpoints. The new IndexedDB backend would be the first consumer of that coordination layer, but other storage endpoints can be implemented on top of it as well. So that part is not meant to be behind a feature flag and can eventually be used by people and, especially, tested by CI.
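(For illustration only: a rough sketch of what a message type for such a coordination layer could look like. Every name in it, CoordinationRequest, IndexedDbRequest, and the bucket fields, is hypothetical and not taken from this PR.)

```rust
// Hypothetical sketch of a coordination-layer message type; none of these
// names come from the PR itself.
use serde::{Deserialize, Serialize};

/// Requests handled by a single client-storage coordination thread, which
/// dispatches to per-endpoint backends (IndexedDB first, others later).
#[derive(Debug, Deserialize, Serialize)]
pub enum CoordinationRequest {
    /// Open (or create) a storage bucket for an origin, per the Storage spec.
    OpenBucket { origin: String, name: String },
    /// Forward an IndexedDB request to the IndexedDB backend.
    IndexedDb(IndexedDbRequest),
    /// Shut the coordination thread down cleanly.
    Exit,
}

/// Hypothetical per-endpoint request type routed through the coordinator.
#[derive(Debug, Deserialize, Serialize)]
pub enum IndexedDbRequest {
    OpenDatabase { origin: String, name: String, version: u64 },
    DeleteDatabase { origin: String, name: String },
    ListDatabases { origin: String },
}
```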
Should we land that in a separate PR to reduce the size of this one? I think it wouldn't be very difficult to hook it up to the current implementation as well, and localStorage should be even easier.
Ok, I discussed this with Martin, and I’m now talking it through with arihant2math. We might move to a hybrid model where the client storage work continues on main and the IndexedDB next work proceeds on a branch. Thanks everyone for the quick initial feedback!
Force-pushed from a757769 to 68e0cd4.
I also want to clarify that the parallel IndexedDB implementation is not meant to compete with the existing one. The goal is simply to enable experimentation with areas that are difficult to prototype in the current implementation, such as a new approach to database opening/deletion, listing existing databases, and trying out alternative threading and communication models.
FYI the "indexeddb-next" branch has been created. Also, can the
Yes, I asked @mrobinson to create it, and it was set up yesterday. And agreed, we need this PR both on main and on the branch. A backport or rebasing the branch would work, but let’s wait for reviews on this PR first.
mrobinson left a comment:
Looks good, with some minor naming nits:
components/shared/storage/lib.rs (outdated):
```rust
#[derive(Clone, Debug, Deserialize, Serialize)]
pub struct StorageThreads {
    web_storage_thread: GenericSender<WebStorageThreadMsg>,
    clientstorage_thread: GenericSender<ClientStorageThreadMsg>,
```
Suggested change:
```diff
-    clientstorage_thread: GenericSender<ClientStorageThreadMsg>,
+    client_storage_thread: GenericSender<ClientStorageThreadMsg>,
```
components/shared/storage/lib.rs (outdated):
```rust
impl StorageThreads {
    pub fn new(
        web_storage_thread: GenericSender<WebStorageThreadMsg>,
        clientstorage_thread: GenericSender<ClientStorageThreadMsg>,
```
Suggested change:
```diff
-        clientstorage_thread: GenericSender<ClientStorageThreadMsg>,
+        client_storage_thread: GenericSender<ClientStorageThreadMsg>,
```
components/shared/storage/lib.rs (outdated):
```rust
    ) -> StorageThreads {
        StorageThreads {
            web_storage_thread,
            clientstorage_thread,
```
Suggested change:
```diff
-            clientstorage_thread,
+            client_storage_thread,
```
```rust
use serde::{Deserialize, Serialize};

#[derive(Debug, Deserialize, Serialize)]
pub enum ClientStorageThreadMsg {
```
Some messages use "Msg" and some use "Message" in the name, so let's go with "Message":
```diff
-pub enum ClientStorageThreadMsg {
+pub enum ClientStorageThreadMessage {
```
Thanks, I’ll fix the naming inconsistencies shortly.
This patch updates the argument order passed to StorageThreads to match the actual creation order (IndexedDB first, WebStorage second). It also updates Constellation to exit the threads in the same order. These changes are cosmetic but improve consistency and readability. Testing: Unit tests continue to pass. Signed-off-by: Jan Varga <jvarga@igalia.com>
This patch updates the naming of the WebStorage thread exit sender/receiver in constellation code to reflect recent renaming of Storage to WebStorage. Testing: Unit tests continue to pass. Signed-off-by: Jan Varga <jvarga@igalia.com>
This patch ensures that unit tests for the newly separated storage_traits crate are included when running `mach test-unit`. Testing: Unit tests continue to pass. Signed-off-by: Jan Varga <jvarga@igalia.com>
This patch introduces a new thread responsible for coordinating all current and future storage endpoints. The prefix "client" was chosen to make it explicit that this functionality applies to client-side, locally persisted storage, and should not be confused with the localStorage API or the StorageManager interface. Testing: Unit tests continue to pass. Signed-off-by: Jan Varga <jvarga@igalia.com>
The extra suffix is not necessary, as the separation between the script side and the backend side of the IndexedDB implementation is already clear from imports (dom::indexeddb vs storage_traits::indexeddb). This patch renames indexeddb_thread.rs to indexeddb.rs and updates all imports accordingly. No functional changes are included. Testing: Unit tests continue to pass. Signed-off-by: Jan Varga <jvarga@igalia.com>
Renames components/script/indexed_db.rs to indexeddb.rs to maintain consistent naming across Servo. The codebase already uses the indexeddb form in most locations, so this change aligns the file name with the rest of the code. Testing: Unit tests continue to pass. Signed-off-by: Jan Varga <jvarga@igalia.com>
Drops the indexeddb path prefix from several imports for consistency with generated DOM bindings. This also simplifies introducing a parallel IndexedDB implementation. Testing: Unit tests continue to pass. Signed-off-by: Jan Varga <jvarga@igalia.com>
Renames clientstorage_thread to client_storage_thread and ClientStorageThreadMsg to ClientStorageThreadMessage. Testing: Unit tests continue to pass. Signed-off-by: Jan Varga <jvarga@igalia.com>
LGTM with one comment. I do wonder at this point whether a thread will be necessary at all. Maybe the coordinator could just be a struct on the constellation. Do you plan to consolidate other threads, like IndexedDBThreadFactory, into this one? If not, I doubt another thread is necessary, since running the basic storage spec logic should not take a long time. Good to keep these questions in mind.
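(A minimal sketch of the alternative floated in this comment, the coordinator as a plain struct owned by the constellation instead of a dedicated thread. All names below are hypothetical, not from this PR.)

```rust
// Hypothetical sketch only; the PR keeps a dedicated thread, this just shows
// the "struct on the constellation" alternative mentioned above.
use std::collections::HashMap;
use std::path::PathBuf;

struct ClientStorageCoordinator {
    config_dir: Option<PathBuf>,
    // Per-origin bookkeeping for storage buckets, filled in lazily.
    buckets: HashMap<String, Vec<String>>,
}

impl ClientStorageCoordinator {
    fn new(config_dir: Option<PathBuf>) -> Self {
        Self { config_dir, buckets: HashMap::new() }
    }

    // The constellation would call this directly from its event loop instead
    // of sending a message to a separate thread.
    fn handle_open_bucket(&mut self, origin: String, name: String) {
        self.buckets.entry(origin).or_default().push(name);
    }
}
```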
```rust
}

impl ClientStorageThreadFactory for GenericSender<ClientStorageThreadMessage> {
    fn new(config_dir: Option<PathBuf>) -> GenericSender<ClientStorageThreadMessage> {
```
Could you return a join handle as well, and have the constellation join on it as part of shutdown? We've made quite a lot of effort on that side in #30849 (this did not include joining on all threads, but if we are adding new ones it's worth doing if possible).
Aha, I guess that means that joining is better than just sending the Exit message from the constellation and synchronously waiting for a reply. Yeah, I can do that if that's the case.
Yes, that is the case (we didn't do it for the various storage threads either, but we should).
> Yes, that is the case (we didn't do it for the various storage threads either, but we should).
Ok, so let me add that. Thanks!
Yeah, I’ve been thinking about that too. It seems that in the long run we won’t need the StorageThreads at all, just the ClientStorageThread. In the very long term, all storage-related functionality might even move into a separate process, I guess.
I haven't seen any evidence of that, for example latency increasing for a specific workflow involving the constellation. Note that adding more messages to be handled by the constellation will, if anything, do the opposite of starving the thread: it will ensure it does a maximum of work within its allocated time slice, which might actually improve performance, since the CPU cache could be hot for a larger number of workflows.
That's fair, and I agree I don't have any hard evidence of potential starvation; it's more of a reflex from Gecko, where eventually all storage-related coordination had to be moved off the main thread in the parent process. That said, my gut feeling is still that storage coordination will benefit from having its own thread, and eventually its own process (with the process main thread acting as the coordination thread and spawning I/O workers, etc.). That could also help mitigate potential issues caused by running unsafe C code (like SQLite)? I don't expect the coordination thread to do a lot of CPU-intensive work, and it's good that you mentioned the CPU cache. For IndexedDB (and potentially other storage endpoints), I've been thinking about a communication and threading model where the script thread has a direct channel, for a given database, to a dedicated backend I/O thread for that specific context, avoiding unnecessary indirection and keeping backend operations for a given database on the same thread as much as possible. This is largely inspired by what we learned in Gecko: using a generic thread pool where work can migrate between threads can hurt performance, exactly for the CPU-cache reasons you mentioned (and I think we already talked about this in the past). I'll try to describe this more clearly and start drafting some initial PRs for the indexeddb-next branch. The whole idea is experimental for now.
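(A sketch of the per-database threading model described in this comment, with std::sync::mpsc standing in for Servo's IPC channels; the type and function names are hypothetical, not from the PR.)

```rust
// Illustrative sketch: when a database is opened, the coordination thread
// spawns a dedicated backend I/O thread for that database and hands the
// script side a direct channel to it, so subsequent requests skip the
// coordinator. Names are hypothetical and std::sync::mpsc stands in for
// Servo's IPC channels.
use std::sync::mpsc::{channel, Sender};
use std::thread;

enum DatabaseRequest {
    Get { key: String, reply: Sender<Option<Vec<u8>>> },
    Put { key: String, value: Vec<u8> },
    Close,
}

/// Hypothetically called by the coordination thread when handling an
/// "open database" request.
fn spawn_database_io_thread() -> Sender<DatabaseRequest> {
    let (sender, receiver) = channel();
    thread::spawn(move || {
        // All operations for this database stay on this one thread, keeping
        // its working set hot in the CPU cache.
        let mut store = std::collections::HashMap::<String, Vec<u8>>::new();
        while let Ok(request) = receiver.recv() {
            match request {
                DatabaseRequest::Get { key, reply } => {
                    let _ = reply.send(store.get(&key).cloned());
                },
                DatabaseRequest::Put { key, value } => {
                    store.insert(key, value);
                },
                DatabaseRequest::Close => break,
            }
        }
    });
    // The returned sender is forwarded to the script thread, giving it a
    // direct channel to this database's I/O thread.
    sender
}
```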
So I looked at how the join handles are implemented for some of the existing threads, and I have a rough plan:
The only question is whether this can be done in a follow-up PR.
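(A minimal sketch of the join-handle approach under discussion, assuming the thread exits after receiving an Exit message. std::sync::mpsc stands in for GenericSender, and the factory function shown here is hypothetical rather than the PR's ClientStorageThreadFactory.)

```rust
// Hypothetical sketch of the join-handle plan: the factory returns the
// thread's JoinHandle alongside the sender, and the constellation joins on
// it after sending Exit during shutdown.
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

enum ClientStorageThreadMessage {
    Exit,
    // ...other messages elided in this sketch...
}

fn new_client_storage_thread() -> (Sender<ClientStorageThreadMessage>, JoinHandle<()>) {
    let (sender, receiver) = channel();
    let join_handle = thread::spawn(move || {
        while let Ok(message) = receiver.recv() {
            match message {
                ClientStorageThreadMessage::Exit => break,
            }
        }
    });
    (sender, join_handle)
}

// During shutdown, the constellation would do something like:
//     let _ = client_storage_sender.send(ClientStorageThreadMessage::Exit);
//     let _ = client_storage_join_handle.join();
```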
Sounds good.
This PR introduces a new storage coordination thread, intended to serve as the central point for managing all current and future storage endpoints in Servo.
In addition to the new coordination thread, this PR also puts in place the infrastructure required to develop a parallel, next-generation IndexedDB implementation under the indexeddb_next feature flag, living on a separate branch.
Testing: Unit and WPT tests continue to pass
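(A hedged sketch of how code paths gated on the indexeddb_next feature might be selected at compile time; only the feature name comes from the PR description, everything else is illustrative.)

```rust
// Illustrative only: the function names are hypothetical; only the
// indexeddb_next feature name comes from the PR description.
#[cfg(feature = "indexeddb_next")]
fn indexeddb_backend() -> &'static str {
    // Route IndexedDB requests through the new client-storage coordination
    // thread and the next-generation backend.
    "indexeddb-next backend"
}

#[cfg(not(feature = "indexeddb_next"))]
fn indexeddb_backend() -> &'static str {
    // Keep using the existing IndexedDB backend.
    "current IndexedDB backend"
}

fn main() {
    println!("Using: {}", indexeddb_backend());
}
```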