| description | Instructions for creating a git mirror of the SQLite sources |
| owner | mackyle@gmail.com |
| last change | Wed, 2 Apr 2025 06:28:51 +0000 (1 23:28 -0700) |
| URL | git://repo.or.cz/sqlite-export.git, https://repo.or.cz/sqlite-export.git |
| push URL | ssh://repo.or.cz/sqlite-export.git, https://repo.or.cz/sqlite-export.git |
As of the SQLite 3.28.0 release there now exists an official Git mirror of the SQLite software at <https://github.com/sqlite/sqlite>.
As of the Fossil 2.9 release on 2019-07-13, a new fossil git
export command provides the ability to export a Fossil repository
to git.
This project continues to provide an alternative mechanism to export
a fossil project to Git. The new fossil 2.9 and later fossil git export
command produces a git repository that more-or-less matches one
created by using this project's --trailer and --manifest options
but without providing any mapping of user IDs to real user names.
| Method | User Names | Options | Repository |
|---|---|---|---|
| fossil 2.9 | left as ID | N/A | FossilOrigin-Name trailers and manifests |
| sqlite-export | mapped | <none> | No notes, trailer lines or manifest files |
| sqlite-export | mapped | --notes | refs/notes/fossil records fossil check-in |
| sqlite-export | mapped | --trailer | FossilOrigin-Name trailer line added |
| sqlite-export | mapped | --manifest | manifest and manifest.uuid files added |
In fact, any of the three options (--notes, --trailer, --manifest)
may be used with this sqlite-export project in any combination to
produce the desired output repository with whatever "extras" are
desired or not.
| Repository | Producer | Extras |
|---|---|---|
| $GH/sqlite/sqlite | fossil 2.9+ | no user mapping, always trailers and manifests |
| $repo/sqlite | sqlite-export | users mapped, fossil origin in refs/notes/fossil |
| $repo/sqlite-manifest | sqlite-export | users mapped, trailers and manifests (no notes) |
| $GH/mackyle/sqlite | sqlite-export | mirror of $repo/sqlite |
The $repo/sqlite repository and its $GH/mackyle/sqlite
mirror are maintained using this project and the --notes option.
The $repo/sqlite-manifest repository is maintained using this
project and both the --trailer and --manifest options but not
the --notes option and should be substantially similar to the
official git mirror of SQLite ($GH/sqlite/sqlite) except that
user ids have been mapped to user names.
Reminder
Fossil version 2.9 and later directly supports exporting the SQLite fossil repository to git. There's no reason to use this project if that export process meets your needs. See the fossil 2.9 release notes for details.
Theoretically exporting with fossil prior to version 2.9 is as simple as:
fossil export --git | git fast-import
Unfortunately it doesn't work that way and that's what this project is all about.
Note
Run the build script with the -h option to see some examples of possible arguments. Any arguments passed to the build script are passed along to the fossil configure script during the build process. Most systems will not require any arguments be passed to the build script.
1. Run the script build to fetch and build a suitable fossil tool and a
   git-export-filter tool.
2. Run the script import (maybe with the --notes and/or other options)
   to create an sqlite.git Git clone of the <https://sqlite.org/src> fossil
   sources. (May take up to 60 minutes.)
3. See the "Building" section at the bottom of this README to "make" SQLite.
There are two problems with fossil export:
1. fossil versions starting with 1.18 mangle export branch and tag names to
   avoid including characters git does not allow. The problem is that many
   more characters are mangled than needed so that a tag like version-1.18
   is converted to version_1_18 unnecessarily.
2. fossil versions after 1.18 produce a Git fast-import data stream that
   causes git fast-import to fail with a fatal error.
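The first problem can be illustrated with a small shell sketch. It assumes the mangling replaces every character that is not an ASCII letter or digit with an underscore, which matches the version-1.18 example above but is not taken from the fossil sources. The function name mangle_name is hypothetical.

```shell
#!/bin/sh
# Assumption: over-eager mangling turns every character that is not an
# ASCII letter or digit into "_". This mirrors the example in the text;
# it is not a copy of fossil's actual code.
mangle_name() {
    printf '%s' "$1" | LC_ALL=C tr -c 'A-Za-z0-9' '_'
}

# "-" and "." are both acceptable in a Git tag name, yet both are replaced.
mangle_name 'version-1.18'; echo
```

Where git is installed, `git check-ref-format refs/tags/version-1.18` can be used to confirm that the unmangled name is already acceptable to Git.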
The fossil change that introduces tag mangling is here:
It was a well-intentioned change as previously invalid Git names would be exported, but it went way, way too far. In fact, the actual Git rules about allowable characters in names are:
A patch is included in the file patches/export_c_patch_diff.txt that allows
the full diversity of git names to be used and should be applied to the fossil
src/export.c file of fossil version 2.1 before building fossil. It also adds
an optional --notes option to the fossil export --git command that if given
will add a note in the refs/notes/fossil namespace to each commit giving the
original fossil check-in hash for the commit. Furthermore, it also provides a
new --use-done-feature option (see git help fast-import) and makes sure there
aren't any whitespace issues with commit messages by transforming CRLF into
just LF and making sure the only whitespace at the end is a single LF.
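The whitespace clean-up described above can be sketched in shell. This is only an illustration of the stated behavior (CRLF becomes LF, and the message ends with exactly one LF); the patch itself modifies src/export.c and may work differently. The function name normalize_message is hypothetical.

```shell
#!/bin/sh
# Read a commit message on stdin and write a cleaned copy to stdout:
#   - a CR at the end of any line is removed (CRLF -> LF)
#   - trailing blank or whitespace-only lines are dropped
#   - trailing spaces/tabs on the final line are dropped
# so the output ends with exactly one LF.
normalize_message() {
    awk '
        { sub(/\r$/, ""); lines[NR] = $0 }
        END {
            n = NR
            while (n > 0 && lines[n] ~ /^[ \t]*$/) n--
            for (i = 1; i <= n; i++) {
                if (i == n) sub(/[ \t]+$/, "", lines[i])
                print lines[i]
            }
        }'
}

# Demo with a made-up message that has CRLF endings and trailing junk.
printf 'Fix typo\r\n\r\nDetails here.  \r\n\r\n' | normalize_message
```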
There may be updates coming to the official fossil release to address this name mangling problem, but as of fossil 2.1 they have yet to make it into any official fossil release.
The fossil change that introduces the export problem is here:
There is even a ticket about this "timewarp" export issue here:
This issue affects the sqlite, sqlite_docsrc and fossil repositories
making it impossible to export them from fossil and import them into Git with
a current version of fossil.
The fossil ticket linked to in the above "The Export Problem" section talks about "timewarps". These are simply check-ins with a timestamp that is earlier than at least one of their parents (merges have two parents, most others one).
Fossil doesn't much like these. The Git fast-import format is a "streamy" format that, while it allows back references to things earlier in the stream, does not allow forward references to future, prospective data. Fossil likes to output its fast-import stream in check-in date order. And there you see the issue. If a "timewarp" is present then children get put out before their parents arrive, and Git rudely ends the fast-import operation when this occurs.
All three of the primary fossil repositories (SQLite, SQLite Docs, Fossil) have at least one "timewarp" in them.
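To make the definition concrete, here is a minimal sketch that flags "timewarps" in a list of check-ins. The input format (one check-in per line: id, timestamp in seconds, then zero or more parent ids) and the function name find_timewarps are assumptions made for this illustration; fossil does not produce this format.

```shell
#!/bin/sh
# Print the id of every check-in whose timestamp is earlier than the
# timestamp of at least one of its parents.
# Hypothetical input format: "<id> <unix-timestamp> [<parent-id> ...]"
find_timewarps() {
    awk '
        {
            ts[$1] = $2
            order[NR] = $1
            for (i = 3; i <= NF; i++) parents[$1] = parents[$1] " " $i
        }
        END {
            for (k = 1; k <= NR; k++) {
                id = order[k]
                n = split(parents[id], p, " ")
                for (j = 1; j <= n; j++)
                    if ((p[j] in ts) && ts[id] + 0 < ts[p[j]] + 0) {
                        print id
                        break
                    }
            }
        }'
}

# Made-up history: c3 is dated before its parent c2.
printf 'c1 100\nc2 300 c1\nc3 200 c2\nc4 400 c3\n' | find_timewarps
```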
Fossil versions 1.18 and earlier produce a usable fast-import stream not because they order the output check-ins correctly in spite of the "timewarp", but because they output all data for each check-in rather than outputting only differences from the parent(s). So while the output isn't really correct, it is accepted by Git and when outside the "timewarp" portion of the history, the converted Git commits have exactly the correct set of sources, so it's really not much more than a minor annoyance when reviewing very tiny parts of older history in the repository.
Starting with fossil version 1.19 this all changed. Now, whenever possible, the exported Git fast-import stream only includes "changes" from a check-in's parent(s). With a sloppy ordering based only on check-in timestamp and in the presence of "timewarps", children get put out before their parent(s) arrive with the ensuing Git rudeness. While, on the surface, this seems like a good change (and it brought the ability to do incremental exports), full exports seem to take somewhat longer overall now.
Then on 2017-02-23, they "shattered it" <https://shattered.it/>: the first practical SHA-1 collision was published.
Shortly thereafter fossil version 2.0 came out supporting additional hash functions. And on 2017-03-12 the official SQLite fossil repository got its first check-in using the new hash function. Versions of fossil prior to 2.0 cannot deal with these new hash function values.
Now you see the problem. Fossil version 1.18 can no longer be used (even with its technically incorrect output) as it cannot understand the new hash values. But fossil versions 1.19 and later (including 2.0) cannot be used either since they produce a completely unacceptable fast-import stream in the presence of any "timewarps".
But, curiosity is a harsh mistress. The topological ordering problem was
solved even for fossil 1.18 in a satisfactory way some time ago but never
published to avoid causing all the Git refs values to be force-updated.
Correcting the misordering caused by the "timewarps" alters the DAG (directed
acyclic graph) of check-in ancestry and that trickles down to all the children
causing all of their commit hash values to change even though the sources they
refer to remain completely unchanged.
As of 2017-03-12 there really isn't a choice anymore.
A GPL version 2 (or later) patch is included to address this in the file
patches/export_topo_patch_diff.txt that provides a guaranteed topological
ordering to the exported fast-import stream. When it's built into fossil,
that version of fossil also becomes covered by the GPL. The repository data
fossil maintains is unaffected by fossil's license(s) so having a GPL-covered
fossil binary should not really affect anyone.
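The difference between date ordering and topological ordering can be shown with the standard tsort utility. This is not the algorithm used by the patch; it only illustrates why a parent-before-child ordering avoids forward references. The input format (id, timestamp, parent ids) and the function names are assumptions made for the sketch.

```shell
#!/bin/sh
# Hypothetical input format: "<id> <unix-timestamp> [<parent-id> ...]"

# Order check-ins by timestamp only (the "sloppy" ordering).
date_order() {
    sort -k2,2n | awk '{ print $1 }' | paste -sd ' ' -
}

# Order check-ins so that every parent precedes its children.
# tsort reads "before after" pairs; a pair of identical ids just
# registers a node that has no parent.
topo_order() {
    awk '{
        if (NF < 3) print $1, $1
        for (i = 3; i <= NF; i++) print $i, $1
    }' | tsort | paste -sd ' ' -
}

# Made-up linear history in which c3 is dated before its parent c2.
checkins='c1 100
c2 300 c1
c3 200 c2
c4 400 c3'

printf '%s\n' "$checkins" | date_order   # c3 appears before its parent c2
printf '%s\n' "$checkins" | topo_order   # parents always come first
```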
A .tar.gz archive of the fossil 2.1 sources may be fetched from:
<https://fossil-scm.org/index.html/uv/fossil-src-2.1.tar.gz>
The downloaded .tar.gz file should have these size and hash values:
size: 4802504 bytes
md5: 9f32b23cecb092d42cdf11bf003ebf8d
sha1: 7c7387efb4c0016de6e836dba6f7842246825678
sha256: 85dcdf10d0f1be41eef53839c6faaa73d2498a9a140a89327cfb092f23cfef05
The archives subdirectory contains a copy of this .tar.gz file and it will
be used by the build script to create a fossil executable that reports its
version as 2.1+export to confirm that it contains the export fixes.
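A downloaded archive can be checked against the size and sha256 values listed above with a few lines of shell. This sketch assumes the sha256sum utility is available (on some systems the equivalent is `shasum -a 256`); the function name verify_archive is hypothetical and is not part of the build script.

```shell
#!/bin/sh
# Succeed only if the file has the expected byte size and sha256 hash.
# Usage: verify_archive <file> <size-in-bytes> <sha256-hex>
verify_archive() {
    have_size=$(wc -c < "$1" | tr -d ' ')
    have_sha256=$(sha256sum "$1" | awk '{ print $1 }')
    [ "$have_size" = "$2" ] && [ "$have_sha256" = "$3" ]
}

# Check the fossil 2.1 archive if it is present in the current directory.
archive=fossil-src-2.1.tar.gz
if [ -f "$archive" ]; then
    if verify_archive "$archive" 4802504 \
        85dcdf10d0f1be41eef53839c6faaa73d2498a9a140a89327cfb092f23cfef05
    then
        echo "$archive: OK"
    else
        echo "$archive: size or hash mismatch"
    fi
fi
```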
The Git fast-import facility does not provide a means to filter the incoming data stream to adjust user names (fossil export data only includes the user login name as the email address) nor a means to adjust branch/tag names (fossil exports a 'trunk' branch where Git expects a 'master' branch and fossil also exports what are essentially lightweight tags as annotated tags).
To deal with these issues, the git-export-filter utility is used.
It can be found at:
The included sqlite_authors file is used with the git-export-filter tool to
supply real user names and email addresses. Note that the sqlite_authors
file also works for the <https://sqlite.org/docsrc> fossil repository.
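The kind of rewriting involved can be pictured with a toy filter. This is emphatically not how git-export-filter works and it does not read the sqlite_authors file (whose format is not shown here); the login "alice", the name "Alice Example" and the function toy_filter are all hypothetical. A naive line-based filter like this would also wrongly touch matching lines inside commit messages or file data, which is one reason a dedicated tool is used.

```shell
#!/bin/sh
# Toy illustration only: rename the trunk branch to master and replace one
# login-only identity with a full name and email address.
toy_filter() {
    sed -e 's|^commit refs/heads/trunk$|commit refs/heads/master|' \
        -e 's|^committer alice <alice> |committer Alice Example <alice@example.com> |'
}

# Two made-up lines in the general shape of a fast-import stream.
printf 'commit refs/heads/trunk\ncommitter alice <alice> 1489276800 +0000\n' |
    toy_filter
```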
After building a patched version of fossil 2.1 as described above and the
git-export-filter utility, a Git repository of the SQLite sources can be
created like so (which is what the import --notes script does):
    fossil clone https://sqlite.org/src sqlite.fsl
    git --git-dir=sqlite.git init --bare
    fossil export --git --notes sqlite.fsl |
    git-export-filter --authors-file sqlite_authors --require-authors \
      --trunk-is-master --convert-tagger tagger |
    git --git-dir=sqlite.git fast-import
The above will create the sqlite.git Git repository that is a clone of the
SQLite sources from the SQLite fossil repository <https://sqlite.org/src>
(note that only sources are cloned, not tickets or wiki pages or events).
The provided build script will attempt to download the necessary sources,
patch them and build suitable fossil and git-export-filter executable files.
It will pass along any arguments directly to the fossil configure script.
Run the build script with the -h option for examples (most systems will not
require any arguments be passed to the build script).
The provided import script will then attempt to clone the SQLite sources
and convert them into an sqlite.git repository. It may be run again to update
the sqlite.git repository with new changes. It accepts the --notes option
(which is recommended) to enable generation of the refs/notes/fossil notes
containing the original fossil check-in hash. It also accepts the --trailer
and --manifest options which may be used in any combination with or without
the --notes option.
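To see what these extras look like from the Git side, the following sketch builds a throwaway repository by hand and reads back a note in refs/notes/fossil and a FossilOrigin-Name trailer. The hash value and the exact note text are made up for the demonstration; the notes written by the import script may be formatted differently. It assumes git is installed.

```shell
#!/bin/sh
# Build a scratch repository with one commit that carries both a
# FossilOrigin-Name trailer and a note in refs/notes/fossil.
repo=$(mktemp -d)
fake_hash=0123456789abcdef   # hypothetical check-in hash
git init -q "$repo"
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "demo commit

FossilOrigin-Name: $fake_hash"
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    notes --ref=fossil add -m "$fake_hash" HEAD

# Read the note back (%N expands to the commit's notes).
note=$(git -C "$repo" log -1 --notes=fossil --format=%N)
# Read the trailer back from the commit message body.
trailer=$(git -C "$repo" log -1 --format=%B |
    sed -n 's/^FossilOrigin-Name: //p')

echo "note:    $note"
echo "trailer: $trailer"
```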
The initial run of the import script may take up to 60 minutes on a fast
machine, and subsequent runs of import even on a fast machine will still,
unfortunately, take some time. The CPU will be pounded in either case.
IMPORTANT
Options passed to the import script are not remembered, so make sure to
pass the same options (e.g. --notes) to the import script every time it's
run if it's being used to update a previously exported Git repository or you
may end up with out-of-date notes and/or mismatched trailer/manifest commits.
There are new options provided by the patch files for the fossil export
command. As a convenience, they may be given to the import script which
will just pass them on to the fossil export command.
--notes
Included with the export tags patch, a new fossil export option --notes is provided that adds a Git commit note to the refs/notes/fossil namespace which contains the original fossil check-in hash for each fossil check-in exported to Git. Use git log --notes=fossil to see these notes.
--trailer
Included with the export tags patch, a new fossil export option --trailer is provided that adds a "FossilOrigin-Name:" trailer line to each commit created in the git repository that includes the original fossil check-in hash for that commit.
--manifest
Included with the export tags patch, a new fossil export option --manifest is provided that causes every commit created in the git repository to include a manifest and manifest.uuid file. Use of this option will increase the size of the generated git repository by approximately 25%.
--use-done-feature
Included with the export tags patch, a new fossil export option --use-done-feature is provided that includes the feature done and done commands at the beginning and end respectively of the exported fast-import stream. This can help avoid partial imports. See the git help fast-import description of the --done option and the git help fast-export description of the --use-done-feature option.
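The idea behind the done feature can be sketched as a check on a saved stream: a stream that announces "feature done" but never reaches its final "done" line was cut short. Git performs this check itself during import, so the sketch below is only an illustration. It assumes the two marker lines are the very first and very last lines of the stream; the function name and the sample streams are hypothetical.

```shell
#!/bin/sh
# Succeed only if the saved stream starts with "feature done" and ends
# with "done" (assumed to be the first and last lines respectively).
stream_is_complete() {
    [ "$(head -n 1 "$1")" = "feature done" ] &&
    [ "$(tail -n 1 "$1")" = "done" ]
}

# Demo with two tiny made-up streams (not real fossil output).
tmpdir=$(mktemp -d)
printf 'feature done\nreset refs/heads/master\ndone\n' > "$tmpdir/full.fi"
printf 'feature done\nreset refs/heads/master\n' > "$tmpdir/cut.fi"
for f in full cut; do
    if stream_is_complete "$tmpdir/$f.fi"; then
        echo "$f: complete"
    else
        echo "$f: truncated"
    fi
done
```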
QUICKLY
1. Clone/checkout the new sqlite.git repository into a new working tree
2. Run the create-fossil-manifest script from this repository with the
   current working directory set to the new working tree created in (1)
3. Now run the configure script in the new working tree created in (1)
4. Now run make in the new working tree created in (1)
DETAILS
Ideally, simply cloning from the new sqlite.git repository would allow one to
then build SQLite by simply using make (or configure and make).
Unfortunately, this is not the case: make will fail with a message about
no rule to make the files manifest and/or manifest.uuid unless the
--manifest option was passed to the import script.
Both the SQLite sources and the Fossil sources require two fossil vcs specific
files to be created (manifest and manifest.uuid) in order for make to be
successful. When the --manifest option is passed to the import script
these files are added to every commit in the generated git repository which
increases the repository size by roughly 25%.
The manifest.uuid file simply contains the hash of the current checkout
and while a real manifest file contains a bunch of information, the only
thing that need be present is a line containing the UTC ISO date preceded
by 'D '.
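The minimal shape of the two files, as just described, can be sketched as follows. This is not the create-fossil-manifest script; it only writes the two pieces of information named above from values supplied by the caller. The function name, the sample hash and the exact date layout are assumptions.

```shell
#!/bin/sh
# Write a minimal manifest.uuid and manifest into a directory.
# Usage: write_minimal_manifest <dir> <check-in-hash> <utc-iso-date>
write_minimal_manifest() {
    # manifest.uuid: just the hash of the current checkout
    printf '%s\n' "$2" > "$1/manifest.uuid"
    # manifest: a single line with the UTC ISO date preceded by "D "
    printf 'D %s\n' "$3" > "$1/manifest"
}

# Demo in a scratch directory with made-up values.
worktree=$(mktemp -d)
write_minimal_manifest "$worktree" 0123456789abcdef 2017-03-12T00:00:00
cat "$worktree/manifest.uuid" "$worktree/manifest"
```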
The create-fossil-manifest script takes care of creating these files and
should be run with the current working directory set to the top-level of the
git clone's working directory if the --manifest option was NOT passed to
the import script.
Any time the HEAD commit changes, the create-fossil-manifest script should
be run to update the manifest and manifest.uuid files (only if the
--manifest option was NOT passed to the import script) before next
running make or the output of the sqlite_source_id() function will be
incorrect.