Feature/sid migration #859

Merged
dpetran merged 15 commits into main from feature/sid-migration
Aug 14, 2024

@dpetran dpetran commented Aug 7, 2024

We've changed several things since v3, and this is a tool to bring existing v3 ledgers up to compatibility with main.

When this migration is run, it will:

  1. generate a nameserver record for the ledger head
  2. create a new genesis commit for the ledger
  3. process each commit in order to:
    • generate index nodes that have correctly serialized SIDs
    • update commit data with new namespace codes
    • remove empty string author entries
    • correctly link to the previous commit, including the new genesis commit

Afterwards, the old files are left behind in the <ledger>/<branch> directory and can be safely deleted manually once everything is confirmed to be working correctly. We may write an additional migration in the future to delete the legacy data directory.
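As a rough illustration of the per-commit pass described above, here is a minimal sketch on plain maps. The commit shape (`:id`, `:author`, `:previous`) and the function names are hypothetical and are not taken from this PR; the sketch covers only the author cleanup and the relinking, not index node generation or namespace codes.

```clojure
;; Hypothetical commit maps; the real commit format is not modelled here.

(defn drop-empty-author
  "Removes the :author key when its value is an empty string."
  [commit]
  (if (= "" (:author commit))
    (dissoc commit :author)
    commit))

(defn migrate-commits
  "Walks `commits` oldest-first, cleaning each one and linking it to the
  previously migrated commit. The first commit links to `genesis`."
  [genesis commits]
  (reduce (fn [migrated commit]
            (conj migrated
                  (-> commit
                      drop-empty-author
                      (assoc :previous (:id (peek migrated))))))
          [genesis]
          commits))
```

Calling `(migrate-commits genesis commits)` returns a vector that starts with the genesis commit, so the first legacy commit ends up pointing at it.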

@dpetran dpetran requested a review from a team August 7, 2024 19:57

dpetran commented Aug 7, 2024

This has only been tested using a filesystem nameserver, so there may be some kinks to work out with other nameserver methods.

@dpetran dpetran force-pushed the feature/sid-migration branch from 3f4ed7f to 3e48712 Compare August 9, 2024 15:43
@dpetran dpetran marked this pull request as draft August 9, 2024 20:50
@dpetran dpetran marked this pull request as ready for review August 9, 2024 22:18
dpetran added 10 commits August 10, 2024 00:33
I made a ledger using db at rev a2c7903, just before we changed the SID encoding. I
then tried to read its head commit using main HEAD, and it didn't work. These are the
minimal changes necessary to support lookup on that style of legacy ledger.

This introduces a private opt :time for jld-ledger/commit!. If you call commit! via the
iCommit protocol, the :time opt will be dropped and a fresh time generated, making it
unavailable to users of the public API.

The commit time is now generated in `enrich-commit-opts` if it is not supplied via
internal api.

I also restructured the enriched commit-opts to make it more clear what options are used
where.
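To make the behaviour concrete, a minimal sketch of the "use the supplied time, otherwise generate one" rule follows. The function names and the opts shape are stand-ins for illustration and are not the actual `enrich-commit-opts` implementation.

```clojure
;; Hypothetical stand-ins; only the :time handling is sketched.

(defn enrich-commit-opts-sketch
  "Keeps a caller-supplied :time, otherwise stamps the current instant."
  [opts]
  (update opts :time #(or % (str (java.time.Instant/now)))))

(defn public-commit-opts
  "Models the public path: any :time passed in is discarded before
  enrichment, so a fresh one is always generated."
  [opts]
  (enrich-commit-opts-sketch (dissoc opts :time)))
```

With this shape, only internal callers that go through `enrich-commit-opts-sketch` directly can pin a commit time.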
For some reason we were generating this twice, once in jld-ledger/commit! and then again
when we were creating the base commit. This removes the redundant one.

When migrating a ledger that doesn't have a genesis commit, we create the genesis
commit. Unfortunately that hardcodes the genesis commit's time to the moment when the
migration was run, which would be newer than the rest of the commits on the ledger.

In order to maintain consistent commit times, this adds the ability to backdate the
genesis commit to the time of the original first commit. It is done via an internal-only
opt to jld-ledger/create*. If the opt is passed in via the public api it will be removed
by `parse-ledger-options`.
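The backdating idea can be sketched as follows. The opt name `:genesis-time`, the commit shape, and both functions are assumptions made for this example; the actual opt name used by jld-ledger/create* is not shown in this PR text.

```clojure
;; Hypothetical opt name and commit shape, for illustration only.

(defn earliest-commit-time
  "Returns the earliest :time among `commits` as a java.time.Instant.
  Assumes each :time is an ISO-8601 instant string."
  [commits]
  (->> commits
       (map #(java.time.Instant/parse (:time %)))
       sort
       first))

(defn genesis-commit
  "Builds a placeholder genesis commit. Uses :genesis-time from `opts`
  when present, otherwise the current instant."
  [opts]
  {:id   "genesis"
   :time (or (:genesis-time opts) (java.time.Instant/now))})
```

A migration would pass `{:genesis-time (earliest-commit-time commits)}`, while a caller that supplies no such opt gets a genesis commit stamped with the current time.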
This is what users will actually have in hand, according to at least one user.

This corresponds with updated logging in server to make tracking migration progress
easier.
@dpetran dpetran force-pushed the feature/sid-migration branch 2 times, most recently from 7f114c6 to 6c80802 Compare August 12, 2024 22:06
In order to tell the difference between ledgers that have run the sid migration and
those that have not, this bumps the commit version.

These are the changes the new version number represents:
- omit f:author key when transaction is unsigned
- include the f:namespaces key in the data file when new namespace codes are generated
- update index nodes to encode identifiers as SIDs
- use the new ledger nameserver record
- use new syntax to address a ledger

Unless it is overridden by the user supplying the `force` argument, skip migrating a
ledger if the commit version indicates it has already been migrated.
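A minimal sketch of that skip check is below. The version number `1` and the treatment of a missing version as `0` are assumptions for the example; the actual version values are not stated here.

```clojure
;; Hypothetical version value; the real commit version is not given in this PR text.
(def migrated-commit-version 1)

(defn needs-migration?
  "True when the ledger should be migrated: either the caller forces it,
  or the commit version predates the migrated version. A missing (nil)
  version is treated as 0, which is an assumption of this sketch."
  [commit-version force?]
  (or (boolean force?)
      (< (or commit-version 0) migrated-commit-version)))
```

For example, `(needs-migration? 1 false)` is false, while `(needs-migration? 1 true)` is true because `force` overrides the version check.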

@zonotope zonotope left a comment

✍🏾

[ledger branch tuples-chans]
(go-try
  (loop [[[commit-tuple ch] & r] tuples-chans
         db (<? (async-db/deref-async (jld-ledger/current-db ledger)))]
zonotope commented

it's possible that current db is a FlakeDB. I think you're going to have to check what kind of db it is before trying to deref it.

dpetran commented

This ledger is a brand new one returned from a call to jld-ledger/create* with no data transacted into it, so I think it will always be an AsyncDB the first time you call current-db on it, but I'll add a guard just in case.
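One possible shape for such a guard, sketched with stand-in record types: the real code works with core.async channels and `async-db/deref-async`, whereas here a plain promise plays the role of the pending db, and both record names are hypothetical.

```clojure
;; Stand-in types for illustration; these are not the fluree.db records.
(defrecord AsyncDBStandIn [db-promise])
(defrecord FlakeDBStandIn [t])

(defn resolve-db
  "Derefs the wrapped promise only when `db` is the async wrapper;
  any other db value is returned unchanged."
  [db]
  (if (instance? AsyncDBStandIn db)
    (deref (:db-promise db))
    db))
```

The point of the guard is that callers get a resolved db either way and never try to deref a value that is already concrete.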

[fluree.db.async-db :as async-db]
[fluree.db.connection :as connection]
[fluree.db.constants :as const]
[fluree.db.flake.flake-db :as db]
zonotope commented

I think this alias should be flake-db so it's more clear what the full namespace is when it's used.

@dpetran dpetran merged commit fbb8c41 into main Aug 14, 2024
@dpetran dpetran deleted the feature/sid-migration branch August 14, 2024 14:53