Skip to content

Remove .scalar directory from enlistment root#239

Merged
derrickstolee merged 11 commits intomicrosoft:masterfrom
derrickstolee:log-location
Nov 20, 2019
Merged

Remove .scalar directory from enlistment root#239
derrickstolee merged 11 commits intomicrosoft:masterfrom
derrickstolee:log-location

Conversation

@derrickstolee
Copy link
Contributor

@derrickstolee derrickstolee commented Nov 13, 2019

Removing the .scalar directory took a few steps, and each commit can be reviewed independently:

  1. Delete all references to mount-id.
  2. Move logs folder out of .scalar folder an into src/.git/logs.
  3. Move diagnostics folder out of .scalar/diagnostics and into .scalarDiagnostics.
  4. Remove log message from ShowStatusWhileRunning (the line that says "run scalar log for more info).
  5. Change the algorithm for finding the enlistment root.
  6. Delete unimportant references to DotScalar constants.
  7. Remove unnecessary references to RepoMetadata.
  8. Delete disk layout upgrades and RepoMetadata.
  9. Delete status cache tests.
  10. Delete all references to .scalar folder.

@derrickstolee derrickstolee added this to the MVP milestone Nov 13, 2019
@wilbaker
Copy link
Member

I saw this is draft so I haven't taken a detailed look through the change yet, but one high level question that came to mind:

With this change what is the best way to tell that a git repo is a Scalar repo (or is that no longer possible)? I'd hope our customers don't care (and don't want to check), but if we wanted to do something like add options for registering/unregistering a repo for maintenance would they be able to validate if they're running against a Scalar repo? Or would the maintenance tasks no-op if they were run against a repo that wasn't a Scalar repo (e.g. one that does use the MIDX)?

@derrickstolee derrickstolee reopened this Nov 13, 2019
@derrickstolee derrickstolee marked this pull request as ready for review November 13, 2019 19:01
@derrickstolee
Copy link
Contributor Author

With this change what is the best way to tell that a git repo is a Scalar repo (or is that no longer possible)? I'd hope our customers don't care (and don't want to check), but if we wanted to do something like add options for registering/unregistering a repo for maintenance would they be able to validate if they're running against a Scalar repo? Or would the maintenance tasks no-op if they were run against a repo that wasn't a Scalar repo (e.g. one that does use the MIDX)?

This is a very good question, and I think it really depends on how we implement the "Scalar watching a vanilla Git repo" feature. I'm hoping to dig into that feature around December when everyone else is on vacation.

In general, we will want to use our custom Git config values, for example:

  • core.useGvfsHelper is non-zero if we are using the GVFS protocol.
  • gvfs.sharedCache is set if we are using an alternate for packfile and commit-graph stuff.
  • We also have scalar.enlistment-id that is used for telemetry (and we should remove scalar.mount-id). We probably want to keep the enlistment id around even for vanilla Git repos.

@wilbaker
Copy link
Member

@derrickstolee thanks for the details!

We also have scalar.enlistment-id that is used for telemetry (and we should remove scalar.mount-id). We probably want to keep the enlistment id around even for vanilla Git repos.

I think this config option sounds like a good fit for what I was asking about.

If we reserve scalar.enlistment-id for repos cloned by scalar we could use it to differentiate repos in telemetry and use it in case one day we have a verb that cares.

Plus there would be no need to check that there isn't a .gvfs folder (something we might want to do if we look for the other options).

@derrickstolee
Copy link
Contributor Author

If we reserve scalar.enlistment-id for repos cloned by scalar we could use it to differentiate repos in telemetry and use it in case one day we have a verb that cares.

True. While I would like to have some notion of "enlistment id" for vanilla Git repos (think about supporting Edge/Chromium developers) we could change the name to git.enlistment-id if we need to.

Copy link
Member

@kewillford kewillford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Just a couple minor comments.

private const int ScalarGenericError = 3;

[TestCase]
public void CloneInsideExistingEnlistment()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we still want to make sure we can't scalar clone in an existing clone?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is something to watch for. But we need to reevaluate how to do that (or if we should).

I'm working to plan out what it would look like for Scalar to watch a vanilla Git repo, or a repo with submodules. I filed #241 to track this.

Copy link
Member

@wilbaker wilbaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I think these changes are looking good, just left a few comments.

public override bool TryGetScalarEnlistmentRoot(string directory, out string enlistmentRoot, out string errorMessage)
{
return WindowsPlatform.TryGetScalarEnlistmentRootImplementation(directory, out enlistmentRoot, out errorMessage);
return WindowsFileSystem.TryGetNormalizedPathImplementation(directory, out enlistmentRoot, out errorMessage);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than calling TryGetNormalizedPathImplementation here, should we something like one of the following instead:

  • Update TryGetScalarEnlistmentRootImplementation to search for src\.git instead of .scalar
  • Remove TryGetScalarEnlistmentRoot completely and update the callers to call TryGetNormalizedPath (confirming that behavior is correct for the caller).

My concern here that some callers of ScalarEnlistment.CreateFromDirectory might be depending on the old behavior of TryGetScalarEnlistmentRootImplementation walking up the directory tree (see its call to Paths.GetRoot). The change to call TryGetNormalizedPathImplementation is equivalent behavior.

Example: When something like scalar diagnose is run from a subfolder, EnlistmentRootPathParameter will be set to Environment.CurrentDirectory:

verb.EnlistmentRootPathParameter = Environment.CurrentDirectory;

And then that path will be passed to ScalarEnlistment.CreateFromDirectory:

ScalarEnlistment enlistment = this.CreateEnlistment(this.EnlistmentRootPathParameter, authentication);

And TryGetScalarEnlistmentRoot needs to walk up the directory to find the correct root.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some clarifications:

  • Update TryGetScalarEnlistmentRootImplementation to search for src\.git instead of .scalar
  • Remove TryGetScalarEnlistmentRoot completely and update the callers to call TryGetNormalizedPath (confirming that behavior is correct for the caller).

These recommendations make sense if we are going to preserve the behavior in the current version of the PR, but not preserve the behavior in master.

My concern here that some callers of ScalarEnlistment.CreateFromDirectory might be depending on the old behavior of TryGetScalarEnlistmentRootImplementation walking up the directory tree (see its call to Paths.GetRoot). The change to call TryGetNormalizedPathImplementation is equivalent behavior.

Example: When something like scalar diagnose is run from a subfolder, EnlistmentRootPathParameter will be set to Environment.CurrentDirectory:

verb.EnlistmentRootPathParameter = Environment.CurrentDirectory;

And then that path will be passed to ScalarEnlistment.CreateFromDirectory:

ScalarEnlistment enlistment = this.CreateEnlistment(this.EnlistmentRootPathParameter, authentication);

And TryGetScalarEnlistmentRoot needs to walk up the directory to find the correct root.

And I agree that I'm changing behavior in a bad way here. I'm not sure that using TryGetNormalizedPathImplementation() is enough, as it seems to have some VFS for Git leftovers (comments like "The folder might not be on disk yet...").

I'm going to instead do the following algorithm:

  1. On clone, set a scalar.enlistmentRoot config value in the Git config.
  2. To discover the enlistment root, we do the following:
    a. Check if the current directory has a src subfolder. If so, append that to the path.
    b. Run git config --local --get scalar.enlistmentRoot in the path to get the config value. Git can walk the paths for us.

I believe this direction is future-proof to allow removing the src folder at clone time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check if the current directory has a src subfolder. If so, append that to the path.

There are src folders all over in some repos so what will happen running something from a directory inside the repo that has a src folder?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nice thing is that the Git command will walk up until seeing the .git folder to get the config option. We don't need to duplicate that option.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On clone, set a scalar.enlistmentRoot config value in the Git config.
To discover the enlistment root, we do the following:
a. Check if the current directory has a src subfolder. If so, append that to the path.
b. Run git config --local --get scalar.enlistmentRoot in the path to get the config value. Git can walk the paths for us.

Another alternative:

  1. Call git config --local --get scalar.enlistment-id to confirm repo is a Scalar repo
  2. The enlistment root is the grandparent of the .git directory (maybe get that with a call to git rev-parse --absolute-git-dir?)

I'm not sure if that approach would be simpler/better, but it would avoid the need to set another config.

I actually wonder if we can use git rev-parse --absolute-git-dir to remove the need for TryGetNormalizedPathImplementation, but that could be its own investigation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized the scenario my suggestion does not handle is checking while we are in the root itself (but perhaps we special case that by looking for a child src/.git directory)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wilbaker: I've updated the algorithm to use that search instead of the config option.

@derrickstolee derrickstolee changed the title Remove .scalar directory from enlistment root [WIP] Remove .scalar directory from enlistment root Nov 14, 2019
@derrickstolee derrickstolee added the disk-layout Issues that impact the disk layout used by Scalar label Nov 14, 2019
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
@derrickstolee derrickstolee changed the title [WIP] Remove .scalar directory from enlistment root Remove .scalar directory from enlistment root Nov 14, 2019
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Copy link
Member

@wilbaker wilbaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few more comments, overall the PR is looking good.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
@derrickstolee derrickstolee merged commit 0b1cae6 into microsoft:master Nov 20, 2019
derrickstolee added a commit that referenced this pull request Nov 22, 2019
@kewillford noticed that the perf builds started failing after #239 merged. Revert it for now, and I'll investigate tomorrow.
derrickstolee added a commit that referenced this pull request Nov 26, 2019
This replaces #239 which was reverted in #248.

The only change is the algorithm for finding the enlistment root. The previous algorithm had an issue when called from the `src` folder as it would create the enlistment root there and then the `ScalarEnlistment` class still automatically added `src` to that for the working directory root.

This was not caught by the functional tests because they run the maintenance verbs from the enlistment root, not the `src` folder. I didn't catch it in testing because my local testing was from the old mechanism using `git config` instead. The performance test suite does reveal this problem.

As a follow-up, we should remove the "working directory backing root" (#250).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cleanup disk-layout Issues that impact the disk layout used by Scalar

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants