Skip to content

[RFC DO NOT MERGE] internal fsmonitor#256

Closed
kewillford wants to merge 51 commits intovfs-2.26.2from
test/vfs-internal-fsmonitor-2.26.0
Closed

[RFC DO NOT MERGE] internal fsmonitor#256
kewillford wants to merge 51 commits intovfs-2.26.2from
test/vfs-internal-fsmonitor-2.26.0

Conversation

@kewillford
Copy link
Member

This is the code for an internal fsmonitor.

Copy link

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few things that popped out to me during an initial scan.

if (!strcmp(command, "quit")) {
if (fsmonitor_listen_stop(state))
error("Could not terminate watcher thread");
sleep_millisec(50);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm...

@kewillford kewillford force-pushed the test/vfs-internal-fsmonitor-2.26.0 branch from 2e35505 to 5193d09 Compare April 24, 2020 15:19
@dscho dscho self-assigned this May 4, 2020
@dscho dscho force-pushed the test/vfs-internal-fsmonitor-2.26.0 branch from 5193d09 to b2fd129 Compare May 5, 2020 21:33
@dscho dscho changed the base branch from vfs-2.26.0 to vfs-2.26.2 May 5, 2020 21:47
@dscho dscho force-pushed the test/vfs-internal-fsmonitor-2.26.0 branch from b2fd129 to 9e9ce6b Compare May 5, 2020 21:47
@dscho dscho force-pushed the test/vfs-internal-fsmonitor-2.26.0 branch 6 times, most recently from 87a0366 to 2f4d904 Compare May 28, 2020 20:26
jeffhostetler and others added 18 commits May 29, 2020 17:20
Teach packet_write_fmt_1() to allocate and release 'buf' and eliminate
the need for a static buffer.  This helps get us ready for threaded
operations.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Teach packet_write_gently() to use a stack buffer rather than a static
buffer when composing the packet line message.  This helps get us ready
for threaded operations.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
So far, the (possibly indirect) callers of `get_packet_data()` can ask
that function to return an error instead of `die()`ing upon end-of-file.

However, when we call this function in a long-running daemon, we
absolutely want the daemon to live, still, even if there was a read
error when one random read failed.

So let's introduce an explicit option to tell the packet reader
machinery to please be nice and only return an error.

This change prepares pkt-line for the internal fsmonitor (which is
precisely such a daemon).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
…uf()

This function currently has only one caller: `apply_multi_file_filter()`
in `convert.c`. That caller wants a flush packet to be written after
writing the payload.

However, we are about to introduce a user that wants to write many
packets before a final flush packet, so let's extend this function to
prepare for that scenario.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Teach write_packetized_from_fd() to use a stack buffer rather
than a static buffer.  This helps get us ready for threaded
operations.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
The `read_packetized_to_strbuf()` function reads packets into a strbuf
until a flush packet has been received. So far, it has only one caller:
`apply_multi_file_filter()` in `convert.c`. This caller really only
needs the `PACKET_READ_GENTLE_ON_EOF` option to be passed to
`packet_read()` (which makes sense in the scenario where packets should
be read until a flush packet is received).

We are about to introduce a caller that wants to pass other options
through to `packet_read()`, so let's extend the function signature
accordingly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Teach `packet_read_line_generic()` to take buffer argument rather than
assuming global `packet_buffer`.  Appended `_r` to function name to
make it clear the intent.

This helps us get ready for threaded or concurrent operations.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
For the upcoming `fsmonitor--daemon` command, we need some way to send a
command from some Git process to that daemon, and to receive an answer.
In other words, we need to support Inter-Process Communication.

By nature, Inter-Process Communication requires OS-specific
implementations.

Instead of using full-blown TCP/UDP socket communication (which comes
with all kinds of challenges such as: restricting access permissions,
collisions when multiple instances might want to listen on the same
port, etc), we use named pipes on Windows, and plan on implementing a
backend based on Unix sockets for Linux/Unix later.

For now, we use a super-simple protocol: identified by a path into
.git/, the daemon listens for clients, reads one NUL-terminated command,
then sends the answer and closes the connection.

Note: we do _not_ use the `pkt-line` protocol: while it is almost
libified, there is at least one code path starting from
`packet_read_line_gently()` that dies in `get_packet_data()`, and we
really, really do not want our simple IPC client to take down the entire
Git process with it when it could not talk to `fsmonitor--daemon`, but
would rather instead fall back to a full scan of all tracked files.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Create `packet_read_line_r()`.  This is a version of
`packet_read_line()` that takes a buffer argument rather than
assuming a static buffer.  This helps us get ready for threaded
or concurrent operations.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Create `packet_read_line_gently_r()`.  This is a version of
`packet_read_line_gently()` that takes a buffer argument rather than
assuming a static buffer.  This helps us get ready for threaded or
concurrent operations.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
This adds a very rudimentary test, just to make sure that the simple
Inter-Process Communication (where supported) works as intended (and
maybe also serving as an example how the command listener is supposed to
be implemented).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
prevent multiple listeners associated with the same path

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
The idea is to provide an efficient default implementation of the
`fsmonitor` hook for the major platforms, starting on the platform that
benefits from it most: Windows (because there is no native `stat()`, it
has to be emulated, which means that refreshing the Git index is a
rather hefty operation on that platform).

Originally, the intention of the `fsmonitor` extension was quite
obviously to use Facebook's `watchman`. However, it turned out that it
is just too hard to make it work reliably on Windows: none of the core
`watchman` developers has access to a Windows machine, and it is even
non-trivial to make the project build on Windows (which is not helped by
the hierarchy of build dependencies).

Besides: when it comes to Git's `fsmonitor` hook, `watchman` does both
too much (it allows pretty complicated queries, which we do not need in
the `fsmonitor` hook) and too little (`watchman` is totally unaware of
Git's internals, e.g. cannot use the `untracked` extension to initialize
its state efficiently upon start-up, therefore the first `git status` is
still slow, even if `watchman` is used via the `fsmonitor`).

Given all these considerations, it makes sense to "roll our own".

This commit merely adds the command without any functionality; Over the
course of the next commits we will implement it in bits and pieces.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This prepares the `pkt-line` machinery to allow multi-threaded use
(which we will need in the FSMonitor daemon because it wants to be able
to serve multiple clients at once, within the same process, in separate
threads).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch implements a framework for a simple Inter-Process
Communication API, using Unix sockets on Linux/Unix/macOS, and Named
Pipes on Windows.

This prepares for the upcoming patch series to add a built-in, Git-aware
FSMonitor.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
dscho and others added 25 commits May 31, 2020 22:37
TODO: test!

It is quite common that the internal fsmonitor receives multiple events
for the same path. There is no need for the fsmonitor to list the path
multiple times, though, as Git just wants to verify via an `lstat()`
whether the file changed or not.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Those paths are not for reportin' ;-)

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
TODO: add a test case

The `core.fsmonitor` setting is supposed to be set to a path pointing to
a script or executable that queries an fsmonitor such as watchman.

We are about to implement our own backend, and do not want to spawn an
extra process just to query it, but want to use Inter-Process
Communication to talk to the daemon (and start it if necessary).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Each worktree has a `.git` in its top-level directory, whether it is a
directory or a file. Once that directory or file is gone, it stops being
a Git worktree. So it is pointless to keep listening in that case.

This also helps the situation on Windows when a user wants to remove a
Git worktree altogether: a running daemon would prevent the deletion of
the worktree because its the latter is the former's working directory
(plus, it has to keep an open handle to that directory for that
`ReadDirectoryChangesW()` function to do its job).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In this topic branch, we start implementing a built-in, Git-aware
FSMonitor.

Thus far, only a Windows-specific backend is implemented, and it will
have to be configured via setting `core.fsmonitor` to the full path of
the hard-linked `git-fsmonitor--daemon` executable in
`libexec/git-core/`; Both of these limitations will be lifted in later
patch series.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
These patches are not strictly necessary for the internal FSMonitor to
work correctly, but improve its functionality and speed.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Instead of setting `core.fsmonitor` to the absolute path of the
hard-linked built-in called `git-fsmonitor--daemon`, we teach Git to
accept the special value `:internal:`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
…Events

Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
…events

TODO: seq needs to be incremented under lock; see whether the code can
be simplified

Because file system events could come in before the query but not
processed yet, the code needs to write a cookie file and only return
the results of the query once the cookie file event has been seen so
that this will guarantee that the query will see all events that happened
before the query was sent.

With this change, we can finally guarantee that the clock ID that the
FSMonitor query reports is non-zero (previously, it could still be
initialized to zero because the first event might not yet have been
seen).

Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
While it is quite possible to #include Apple's CoreServices.h when
compiling C source code with clang, trying to build it with GCC
currently fails with this error:

In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/AuthSession.h:32,
                 from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/Security.h:42,
                 from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/OSServices.framework/Headers/CSIdentity.h:43,
                 from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/OSServices.framework/Headers/OSServices.h:29,
                 from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/LaunchServices.framework/Headers/IconsCore.h:23,
                 from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Frameworks/LaunchServices.framework/Headers/LaunchServices.h:23,
                 from /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/CoreServices.framework/Headers/CoreServices.h:45,
 /Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Security.framework/Headers/Authorization.h:193:7: error: variably modified 'bytes' at file scope
   193 | char bytes[kAuthorizationExternalFormLength];
       |      ^~~~~

The underlying reason is that GCC (rightfully) objects that an `enum`
value such as `kAuthorizationExternalFormLength` is not a constant
(because it is not, the preprocessor has no knowledge of it, only the
actual C compiler does) and can therefore not be used to define the size
of a C array.

This is a known problem and tracked in GCC's bug tracker:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93082

In the meantime, let's not block things and go the slightly ugly route
of declaring/defining the FSEvents constants, data structures and
functions that we need, so that we can avoid above-mentioned issue.

Let's do this _only_ for GCC, though, so that the CI/PR builds (which
build both with clang and with GCC) can guarantee that we _are_ using
the correct data types.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
TODO: we should be able to figure this out on the read-cache.c side
_without_ requiring a trailing `/`.

Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In this topic branch, we add a macOS-specific backend to the internal
FSMonitor daemon.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In particular on macOS, our tests indicate that the events might arrive
with a noticable lag. To make sure that we consumed all events we
intended to receive, we now write a "cookie file", i.e. a temporary file
whose only purpose is to trigger a corresponding event; once the event
arrived, we know we're all caught up.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This adds a Linux-specific backend to the internal FSMonitor.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Pseudo topic branch, to keep track of the outstanding TODOs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We only need to test the simple IPC and the fsmonitor feature, and not
even everywhere: the two Windows builds, both macOS and one Linux build
are enough to build confidence.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The previous commit is for debugging macOS issues only, and should only
be used if absolutely necessary because it is actually quite awkward to
abuse GitHub Actions' macOS agents for interactive debugging.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho force-pushed the test/vfs-internal-fsmonitor-2.26.0 branch from 83555be to e231804 Compare May 31, 2020 20:53
@dscho dscho mentioned this pull request Jun 1, 2020
@dscho
Copy link
Member

dscho commented Jun 1, 2020

Closing, in favor of #273.

@dscho dscho closed this Jun 1, 2020
@dscho dscho deleted the test/vfs-internal-fsmonitor-2.26.0 branch June 1, 2020 20:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants