Skip to content

Conversation

@bohwaz
Copy link
Contributor

@bohwaz bohwaz commented Oct 18, 2018

See https://www.sqlite.org/backup.html for details

Usage is: SQLite3::backup(&destinationDb[, source_dbname = "main"[, destination_dbname = "main"]])

Note that this method will only be available with SQLite3 >= 3.6.11 released 2009-02-18

$db = new SQLite3('a.db');
$db2 = new SQLite3('b.db');

$db->backup($db2);

You can specify the name of the source and destination database (in case you have multiple opened database using ATTACH). "main" being the default database and "temp" being the temporary database (different than the ":memory:" database, see https://www.sqlite.org/inmemorydb.html last part for details on temporary database.

$db2->exec("ATTACH 'backup.db' AS 'backup';");
$db->backup($db2, "main", "backup");

@php-pulls
Copy link

Comment on behalf of petk at php.net:

Labelling... Thank you @bohwaz for the pull request

@cmb69
Copy link
Member

cmb69 commented Oct 18, 2018

Thanks! Very interesting. Have you considered to implement the full Sqlite3 backup API? See also https://bugs.php.net/bug.php?id=70950.

Is there a particular reason why $destinationDb is passed as reference?

Anyhow, I believe there are a lot of design alternatives, which might deserve further discussion on the internals@ mailing list.

@bohwaz
Copy link
Contributor Author

bohwaz commented Oct 18, 2018

That's a good question, I actually just followed the example from the SQLite3 doc.

So I did a short benchmark: creating a DB with a single table and filling it with 100K random images between 4KB to 100KB (making a 1.6 GB file).

  • Creating original.db using a PHP script: ~28 seconds
  • Copying original.db to copy.db using cp: ~47 seconds
  • Backuping original.db to copy.db using SQLite3::backup and a page count of -1 in sqlite3_step (meaning copy all pages): ~26 seconds
  • Backuping original.db to copy.db using SQLite3::backup and a page count of 5 in sqlite3_step: more than 5 minutes
  • Backuping original.db to copy.db using SQLite3::backup and a page count of 500 in sqlite3_step: more than 5 minutes

In all SQLite3 cases the memory use was very low and about the same.

So I don't really see a point in exposing the full API as it doesn't seem to offer anything over just exposing a simple method like this. But I'm open to suggestions, but it would make the use of that feature much more complicated.

I added two commits, one to change the number of pages in sqlite3_backup_step to -1 and one to remove the reference (yeah it's not required).

@cmb69
Copy link
Member

cmb69 commented Oct 19, 2018

So I don't really see a point in exposing the full API as it doesn't seem to offer anything over just exposing a simple method like this.

I agree that this would be drastically more complex, and if anybody wants the full API (for instance, to implement a progress bar), we could still consider to add it. However, it might be prudent to add an optional page count parameter. -1 is certainly the fastest way to create a backup, but the database is locked for writing during the backup; a cron job, for instance, might prefer to use something else than -1.

@bohwaz
Copy link
Contributor Author

bohwaz commented Oct 20, 2018

Good point, but then maybe we should also expose the duration of the sleep between calls to sqlite3_backup_step? As this is what may allow other processes to write to the database during the backup.

One issue as noted in the SQLite doc is that writes to the DB during the sleep will trigger a restart of the backup process. Meaning that in the right (or wrong) conditions of constant writes, the backup process may never end and the function never return. So maybe we should also add a timeout parameter?

@cmb69
Copy link
Member

cmb69 commented Oct 20, 2018

Oh, you're right! Five optional parameters would make a terrible API, though, and using a single options array is only slightly better (results in quite some complexity in the implementation). Maybe it's best to keep the method simple as it is, and to keep everything else for a future possible full fledged userland backup API.

@bohwaz
Copy link
Contributor Author

bohwaz commented Oct 20, 2018

Yeah I think that 99% of cases just using -1 is good enough. Maybe when we get named parameters in PHP? Or implementing the full API (like eg. SQLite3Backup::init, ::step, ::finish etc.) but it seems quite cumbersome to use for those 99% of cases.

@bohwaz
Copy link
Contributor Author

bohwaz commented Oct 24, 2018

The AppVeyor build fails for something unrelated to my commit

@cmb69
Copy link
Member

cmb69 commented Oct 24, 2018

Any objections to merge this into master?

Copy link
Member

@cmb69 cmb69 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't closely review this PR until now, and it seems to me a few issues should be addressed,

@bohwaz
Copy link
Contributor Author

bohwaz commented Nov 14, 2018

I also thought of a possible issue: if one of the databases is locked and never released, the backup method will wait indefinitely (as SQLite will return SQLITE_BUSY all the time). So we should limit the waiting time to avoid that. The idea would be to add a timeout in the loop like this:

			if (rc == SQLITE_OK || rc == SQLITE_BUSY || rc == SQLITE_LOCKED) {
				if (time_spent >= timeout) {
					php_sqlite3_error(source_obj, "Backup failed as it reached the maximum execution time of %d ms", timeout);
					RETURN_FALSE;
				}

				time_spent += sqlite3_sleep(250);
			}

There are three ways to get the timeout number:

  1. Supply it as an argument to the method
  2. Set a timeout on the DB object when calling ::busyTimeout and re-use it (but busyTimeout defaults to zero currently, which means never wait and always fail if it's busy, so we might also want to change that default to something like 10 seconds as it makes more sense)
  3. Fetch it from SQLite itself with a PRAGMA busy_timeout;

I would argue that the second option is best, as it keeps the API simple.

@bohwaz
Copy link
Contributor Author

bohwaz commented Nov 15, 2018

AppVeyor is still failing for something completely unrelated :(

@cmb69
Copy link
Member

cmb69 commented Nov 15, 2018

AppVeyor is still failing for something completely unrelated :(

Already reported on internals-win@. Anatol is working on a fix.

@bohwaz
Copy link
Contributor Author

bohwaz commented Nov 15, 2018

AppVeyor is still failing for something completely unrelated :(

Already reported on internals-win@. Anatol is working on a fix.

Awesome, thanks!

@cmb69
Copy link
Member

cmb69 commented Nov 30, 2018

if one of the databases is locked and never released, the backup method will wait indefinitely

Good catch!

Set a timeout on the DB object when calling ::busyTimeout and re-use it

As I understand it, if sqlite3_backup_step() fails to establish the lock, SQLite3 already sleeps for the time specified in ::busyTimeout(), and only afterwards returns SQLITE_BUSY. So we can't use the busyTimeout value for a general timeout. Also, we should probably not call sqlite3_sleep(), if sqlite3_backup_step() returns SQLITE_BUSY, but rather rely on users to set an appropriate ::busyTimeout.

Now I wonder about SQLITE_OK; it seems this can never happen, since we're passing a negative nPage to sqlite3_backup_step. Also I wonder about SQLITE_LOCKED; the docs state:

If the source database connection is being used to write to the source database when sqlite3_backup_step() is called, then SQLITE_LOCKED is returned immediately.

Could this ever happen for ext/sqlite3 (unless pthreads or such is used)? If not, it seems to me that we wouldn't have to call sqlite3_sleep() at all.

Anyhow, I think we should add the overall timeout as parameter to SQLite3::backup(). Yet another alternative might be to call a user function whenever the database is busy. Or should we reconsider making an SQLiteBackup class available to userland?

@cmb69
Copy link
Member

cmb69 commented Dec 16, 2018

It just occured to me that having a look at the implementation of the backup method of Tcl sqlite objects could be reasonable. The actual backup code is astonishingly simple:

    pBackup = sqlite3_backup_init(pDest, "main", pDb->db, zSrcDb);
    /* handle initialization error */
    while(  (rc = sqlite3_backup_step(pBackup,100))==SQLITE_OK ){}
    sqlite3_backup_finish(pBackup);
    /* handle backup error */

Since tclsqlite.c is part of the SQLite repo, it might be just the right solution for us as well.

Note that the signature of the Tcl method corresponds to

proto bool SQLite3::backup(string backup_filename [, string source_database = "main"])

which might be a reasonable further simplification.

@bohwaz
Copy link
Contributor Author

bohwaz commented Dec 17, 2018

Indeed it seems like it is an easier implementation :)

As for the API, it would be easier to just have to supply the target filename, but then if you wanted to use the backup method to merge an existing database into another one and also query this second database you would need to call ::backup and then open the database to query it. But it would make things easier for 90% of use cases I guess, so… you've got a point. Also changing it to use a filename would prevent copying an existing file database to a new in-memory DB, which might be a useful thing.

I will change my patch and test it to see how it behaves without the sleep part.

@bohwaz bohwaz force-pushed the patch/sqlite3-backup branch from ade0247 to 7ec8469 Compare March 7, 2019 13:40
@bohwaz
Copy link
Contributor Author

bohwaz commented Mar 7, 2019

OK I updated the code, and now it will just fail if the source database is busy or locked.

@bohwaz
Copy link
Contributor Author

bohwaz commented Apr 4, 2019

Hi @cmb69 if you have some time can you tell me if it's fine for you? cheers :)

@bohwaz
Copy link
Contributor Author

bohwaz commented Jun 17, 2019

I think I resolved all the requested changes, but it still shows as "1 review requesting changes". Does anyone have a bit of time so that we can get that merged for PHP 7.4? Thanks :)

@krakjoe
Copy link
Member

krakjoe commented Jun 17, 2019

@cmb69 since you're familiar with this already, could you find a little time to wrap it up please ?

@bohwaz
Copy link
Contributor Author

bohwaz commented Jun 17, 2019

@cmb69 since you're familiar with this already, could you find a little time to wrap it up please ?

I already reached to him and unfortunately he seems quite busy, so I don't want to pressure him too much either, if someone else can take care of it :)

@cmb69
Copy link
Member

cmb69 commented Jun 17, 2019

I'll check it out later today. :)

@cmb69
Copy link
Member

cmb69 commented Jun 17, 2019

Thank you very much @bohwaz. Applied as ce22ccc.

@cmb69 cmb69 closed this Jun 17, 2019
@bohwaz
Copy link
Contributor Author

bohwaz commented Jun 17, 2019

Awesome, thank you very much @cmb69 :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants