Skip to content

Sqlite placeholders#1140

Merged
kewillford merged 53 commits intomicrosoft:masterfrom
kewillford:sqlite_placeholders
May 30, 2019
Merged

Sqlite placeholders#1140
kewillford merged 53 commits intomicrosoft:masterfrom
kewillford:sqlite_placeholders

Conversation

@kewillford
Copy link
Member

@kewillford kewillford commented May 9, 2019

I think this is far enough along to start taking a look at. There is a lot of new code to handle the sqlite database but the other changes should be minimal.

I would like to get some feedback on how I am setting up the class for the connection and the class for the Placeholders table. Something to keep in mind is if we are going to add ModifiedPaths to this same database, does this make it easy or is there a better way?

Currently it is structured with a GVFSDatabase class that initializes the database and keeps a pool of connections for consumers to use. The connection is wrapped in a class that implements IDbConnection so that the Dispose method can return the underlying connection to the pool. Like the following:

using (IDbConnection pooled = this.database.GetPooledConnection())
using (SqliteCommand command = pooled.CreateCommand())
{

Another question is also how we show handle exceptions. The FileBasedCollection would have try catches and throw a custom FileBasedCollectionException. Should we be doing something similar for SQLite or just let exceptions bubble up?

Changed the Count property to be a method because it is now running a SQLite query and will take longer than returning a value and it being a method makes that more obvious.

Removed some constructors that caused me some confusion when I was adding the IPlaceholderDatabase as a parameter but I'm open to adding them back in if they make sense.

The most critical part to review would be the changes in GitIndexProjection.

TODO:

  • Add Upgrader
  • Error Handling
  • See what other possible tests I can add
  • Repair
  • Validate dehydrate works as expected
  • Validate diagnose works as expected
  • Add custom exception for sqlite database errors

@kewillford kewillford added the WIP label May 9, 2019
@kewillford kewillford self-assigned this May 9, 2019
@kewillford
Copy link
Member Author

/azp run PR - Windows - Functional Tests

@azure-pipelines
Copy link

Pull request contains merge conflicts.

@kewillford
Copy link
Member Author

/azp run PR - Windows - Extra

@azure-pipelines
Copy link

Pull request contains merge conflicts.

@kewillford kewillford force-pushed the sqlite_placeholders branch 2 times, most recently from becd8b0 to 0f9e40c Compare May 13, 2019 16:47
@kewillford
Copy link
Member Author

/azp run GitHub VFSForGit Mac Functional Tests

@azure-pipelines
Copy link

No pipelines are associated with this pull request.

@kewillford
Copy link
Member Author

/azp run GitHub VFSForGit Mac Functional Tests

@azure-pipelines
Copy link

No pipelines are associated with this pull request.

@kewillford kewillford added this to the M153 milestone May 15, 2019
@kewillford kewillford changed the title [WIP] Sqlite placeholders Sqlite placeholders May 15, 2019
@kewillford kewillford marked this pull request as ready for review May 15, 2019 19:13
@kewillford kewillford added affects: performance pr: refactoring Keeping the codebase healthy and removed WIP labels May 15, 2019
@wilbaker
Copy link
Member

Another question is also how we show handle exceptions. The FileBasedCollection would have try catches and throw a custom FileBasedCollectionException. Should we be doing something similar for SQLite or just let exceptions bubble up?

FWIW we went with having a custom exception type for the BlobSizes database (BlobSizesException) to make it easier for callers to know what exception type(s) to expect. The database code was able to map all possible exceptions to a single well know type.

Copy link
Member

@wilbaker wilbaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a first set of comments on the database code specifically. I still need to review the tests, and how the new databases integrate with the rest of the codebase.

command.Parameters.Add("@sha", SqliteType.Text).Value = placeholder.Sha;
}

command.ExecuteNonQuery();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried measuring the performance impact of using prepared statements, and if so, what did you find?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did on the Contains method that I had and it did help. The issue here would be that you need to use the same connection that the prepare was ran against. This would require the class to keep connections around and do locking for using the commands that are prepared which would really add to the complexity so I will look into that after the PR is smoothed out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also ran some testing using a simple locking strategy with one prepared connection and command and things were slowed down because of the lock. The next experiment will be to have a pool of prepared connections and commands to remove the need for the lock.

namespace GVFS.UnitTests.Common.Database
{
[TestFixture]
public class PlaceholdersTests
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class goes a little mock crazy so I'm not sure the value this brings vs the maintenance cost of all the mocking it is doing.

@kewillford
Copy link
Member Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 5 pipeline(s).

@kewillford
Copy link
Member Author

/azp run PR - Windows - Functional Tests

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@derrickstolee
Copy link
Contributor

@kewillford What's the status here. Are you still doing investigations? I think this is good to go, and it's probably best to let it simmer in CI before a release.

Cc: @jrbriggs, @jamill in case they know the reason.

@kewillford
Copy link
Member Author

@derrickstolee, @wilbaker was going to take another look today.

Copy link
Member

@wilbaker wilbaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Posting comments so far, I still need to review:

  • Test files
  • The rest of GVFSDatabase.cs
  • PlaceholderTable.cs

Copy link
Member

@wilbaker wilbaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need to review the tests, but wanted to post the rest of my comments on the product changes first.

Copy link
Member

@wilbaker wilbaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding comments on the functional test changes, reviewing the unit tests now.

Copy link
Member

@wilbaker wilbaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few more comments on the Unit Tests

Copy link
Member

@wilbaker wilbaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved with a few more minor non-blocking comments (posted earlier today)

@turbonaitis
Copy link
Contributor

I'm a bit late to the party, but do we have some performance numbers here? How was the old placeholder DB performing and by how much the SQLite solution improves it?

@kewillford
Copy link
Member Author

@turbonaitis here are some number with ~1 million placeholders

Old

Load and get all placeholders: 41.253 s
Write and update placeholders: 21.940 s

SQLite

Load and get all placeholders: 2.053 s
Write and update placeholders: 23.940 s

@turbonaitis
Copy link
Contributor

turbonaitis commented Jul 15, 2019 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

affects: performance pr: refactoring Keeping the codebase healthy

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants