WIP [DO NOT MERGE] FSMonitor3 onto vfs-2.30.0#289
WIP [DO NOT MERGE] FSMonitor3 onto vfs-2.30.0#289jeffhostetler wants to merge 47 commits intomicrosoft:vfs-2.30.0from
Conversation
|
There are some build issues, apparently. As to vs-build, it fails with this message:
This is the well-known change in The second issue seems to be in the macOS-specific FSMonitor tests:
It would appear to me as if deleting |
|
@dscho Hey thanks for the quick look. I hadn't seen the expat issue until this morning. I think I probably had that commit in my upstream + 11th batch version. WRT the test fail on mac, look at compat/fsmonitor/macos.c. I put a TODO comment in there describing it. I'd like to talk to you about expanding the |
Yes, it's already in upstream's
I think that my idea there was to avoid false positives when a directory is replaced by a file (this can happen when you call
Good idea! |
6f3082a to
1138b55
Compare
e094618 to
2924876
Compare
abe1b70 to
359feb2
Compare
f678c3a to
e1a4c70
Compare
ae09bac to
a75bca4
Compare
Isolate and document initialization of `istate->fsmonitor_last_update`. This field should contain a fsmonitor-specific opaque token, but we need to initialize it before we can actually talk to a fsmonitor process, so we create a generic default value. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Create man page to describe the `git fsmonitor--daemon` feature. Update references to `core.fsmonitor` and pointers to `watchman` to mention the fsmonitor daemon feature. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
94ef1bd to
e91ef3e
Compare
Create client routines to spawn a fsmonitor daemon and send it an IPC request using `simple-ipc`. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
The `core.fsmonitor` setting is supposed to be set to a path pointing to a script or executable that (via the Hook API) queries an fsmonitor process such as watchman. We are about to implement our own fsmonitor backend, and do not want to spawn hook processes just to query it. Lets use `Simple IPC` to directly communicate with the daemon (and start it if necessary). Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Create a built-in file system monitoring daemon that can be used by the existing `fsmonitor` feature (protocol API and index extension) to improve the performance of various Git commands, such as `status`. The `fsmonitor--daemon` feature builds upon the `Simple IPC` API and provides an alternative to hook access to existing fsmonitors such as `watchman`. This commit merely adds the new command without any functionality. Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Implement command options `--stop`, `--is-running`, `--query`, `--query-index`, and `--flush` to control and query the status of a `fsmonitor--daemon` server process (and implicitly start a server process if necessary). Later commits will implement the actual server and monitor the file system. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Stub in empty backend for fsmonitor--daemon on Windows. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Stub in empty implementation of fsmonitor--daemon backend for MacOS. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Implement command options `--run` and `--start` to try to begin listening for file system events. This version defines the thread structure with a single fsmonitor_fs_listen thread to watch for file system events and a simple IPC thread pool to wait for connections from Git clients over a well-known named pipe or Unix domain socket. This version does not actually do anything yet because the backends are still just stubs. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach fsmonitor--daemon to classify relative and absolute pathnames and decide how they should be handled. This will be used by the platform-specific backend to respond to each filesystem event. When we register for filesystem notifications on a directory, we get events for everything (recursively) in the directory. We want to report to clients changes to tracked and untracked paths within the working directory. We do not want to report changes within the .git directory, for example. This classification will be used in a later commit by the different backends to classify paths as events are received. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach fsmonitor--daemon to create token-ids and define the overall token naming scheme. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach fsmonitor--daemon to build lists of changed paths and associate them with a token-id. This will be used by the platform-specific backends to accumulate changed paths in response to filesystem events. The platform-specific event loops receive batches containing one or more changed paths. Their fs listener thread will accumulate them in a `fsmonitor_batch` (and without locking) and then "publish" them to associate them with the current token and to make them visible to the client worker threads. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach the win32 backend to register a watch on the working tree root directory (recursively). Also watch the <gitdir> if it is not inside the working tree. And to collect path change notifications into batches and publish. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Include MacOS system declarations to allow us to use FSEvent and
CoreFoundation APIs. We need GCC and clang versions because of
compiler and header file conflicts.
While it is quite possible to #include Apple's CoreServices.h when
compiling C source code with clang, trying to build it with GCC
currently fails with this error:
In file included
from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/AuthSession.h:32,
from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/Security.h:42,
from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/OSServices.framework/Headers/CSIdentity.h:43,
from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/OSServices.framework/Headers/OSServices.h:29,
from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/LaunchServices.framework/Headers/IconsCore.h:23,
from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/LaunchServices.framework/Headers/LaunchServices.h:23,
from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Headers/CoreServices.h:45,
/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/Authorization.h:193:7: error: variably modified 'bytes' at file scope
193 | char bytes[kAuthorizationExternalFormLength];
| ^~~~~
The underlying reason is that GCC (rightfully) objects that an `enum`
value such as `kAuthorizationExternalFormLength` is not a constant
(because it is not, the preprocessor has no knowledge of it, only the
actual C compiler does) and can therefore not be used to define the size
of a C array.
This is a known problem and tracked in GCC's bug tracker:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93082
In the meantime, let's not block things and go the slightly ugly route
of declaring/defining the FSEvents constants, data structures and
functions that we need, so that we can avoid above-mentioned issue.
Let's do this _only_ for GCC, though, so that the CI/PR builds (which
build both with clang and with GCC) can guarantee that we _are_ using
the correct data types.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Implement file system event listener on MacOS using FSEvent, CoreFoundation, and CoreServices. Co-authored-by: Kevin Willford <Kevin.Willford@microsoft.com> Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach fsmonitor--daemon to respond to IPC requests from client Git processes and respond with a list of modified pathnames relative to the provided token. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach fsmonitor--daemon to periodically truncate the list of modified files to save some memory. Clients will ask for the set of changes relative to a token that they found in the FSMN index extension in the index. (This token is like a point in time, but different). Clients will then update the index to contain the response token (so that subsequent commands will be relative to this new token). Therefore, the daemon can gradually truncate the in-memory list of changed paths as they become obsolete (older that the previous token). Since we may have multiple clients making concurrent requests with a skew of tokens and clients may be racing to the talk to the daemon, we lazily truncate the list. We introduce a 5 minute delay and truncate batches 5 minutes after they are considered obsolete. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Define GIT_TEST_FSMONITOR_CLIENT_DELAY as a millisecond delay. Introduce an artificial delay when processing client requests. This make the CI/PR test suite a little more stable and avoids the need to load up test scripts with sleep statements to avoid racy failures. This was mostly seen on 1 or 2 core CI build machines where the test script would create a file and quickly try to confirm that the daemon had seen it *before* the daemon had received the kernel event and causing a test failure. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach fsmonitor--daemon client threads to create a cookie file inside the .git directory and then wait until FS events for the cookie are observed by the FS listener thread. This helps address the racy nature of file system events by blocking the client response until the kernel has drained any event backlog. This is especially important on MacOS where kernel events are only issued with a limited frequency. See the `latency` argument of `FSeventStreamCreate()`. The kernel only signals every `latency` seconds, but does not guarantee that the kernel queue is completely drained, so we may have to wait more than one interval. If we increase the frequency, the system is more likely to drop events. We avoid these issues by having each client thread create a unique cookie file and then wait until it is seen in the event stream. Co-authored-by: Kevin Willford <Kevin.Willford@microsoft.com> Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Set the `FSMONITOR_CHANGED` bit on `istate->cache_changed` when the fsmonitor response contains a different token to ensure that the index is written to disk. Normally, when the fsmonitor response includes a tracked file, the index is always updated. Similarly, the index might be updated when the response alters the untracked-cache (when enabled). However, in cases where neither of those cause the index to be considered changed, the fsmonitor response is wasted. And subsequent commands will continue to make requests with the same token and if there have not been any changes in the working directory, they will receive the same response. This was observed on Windows after a large checkout. On Windows, the kernel emits events for the files that are changed as they are changed. However, it might delay events for the containing directories until the system is more idle (or someone scans the directory (so it seems)). The first status following a checkout would get the list of files. The subsequent status commands would get the list of directories as the events trickled out. But they would never catch up because the token was not advanced because the index wasn't updated. This list of directories caused `wt_status_collect_untracked()` to unnecessarily spend time actually scanning them during each command. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Repeat all of the fsmonitor perf tests using `git fsmonitor--daemon` and the "Simple IPC" interface. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
e91ef3e to
cbc3f7d
Compare
…es.scalar=experimental` This expands the `GitVersion` parser to include an `Extra` string. This will contain anything after `2.X.Y.vfs.Z.W`. For instance, the tag `v2.30.0.vfs.0.0.exp` contains the builtin FS Monitor feature (see microsoft/git#289). To enable experimental features, we check the new `feature.scalar` config setting. This has a few modes: 1. `feature.scalar=false` implies we want to disable optional features like FS Monitor. 2. `feature.scalar=true` or unset implies we want to use our default values, like using Watchman for FS Monitor. 3. `feature.scalar=experimental` means we should use available experimental features like FS Monitor. In the future, this could also include early versions of sparse-index. After using `git config` to set the appropriate `feature.scalar` value locally, users can run `scalar run config` to re-initialize features according to this recommendation. These instructions will be part of the experimental release documentation when we are closer to making that available. This is appropriate for full releases, as it will do nothing new if users don't have the experimental Git release as well.
This fixes a test failure in t0012-help.sh. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
|
@jeffhostetler this can be closed now, right? |
|
yes, thanks! |
This PR merges [pre-RFC] simple-ipc and [WIP] fsmonitor3 onto vfs-2.28.0+jk/strvec for testing purposes.
This PR should replace #282