Skip to content

Item should link to commit#59

Merged
simonw merged 1 commit intosimonw:mainfrom
tomviner:item-refs-commit
Nov 20, 2022
Merged

Item should link to commit#59
simonw merged 1 commit intosimonw:mainfrom
tomviner:item-refs-commit

Conversation

@tomviner
Copy link
Contributor

@tomviner tomviner commented Oct 21, 2022

There's a bug in the non-id branch. Each item in items is redefined, with a new object, then _commit is set, but never used anywhere.

In my use case, I need to know the commit each item comes from, and this fix allows it:
image

@simonw
Copy link
Owner

simonw commented Nov 20, 2022

This is a good fix, thanks.

@simonw simonw merged commit b0e582a into simonw:main Nov 20, 2022
simonw added a commit that referenced this pull request Nov 20, 2022
@simonw
Copy link
Owner

simonw commented Nov 20, 2022

An interesting challenge with this change: since it modifies the schema, shipping a release with it could break existing databases the next time git-history file ... is run against them.

This would affect my workflow here for example: https://github.com/simonw/scrape-instances-social/blob/main/.github/workflows/scrape.yml

Options for doing this:

  • Document this as a breaking change (maybe even by shipping a 1.x version) and let people know how to work around it
  • Have the code tool itself learn how to modify the schema on existing tables to fix this

I'm leaning towards the second option, depending on how hard it will be to implement.

@tomviner tomviner deleted the item-refs-commit branch November 20, 2022 23:06
@simonw
Copy link
Owner

simonw commented Nov 20, 2022

Worth noting that this really was a bug: the code was designed to create that _commit column but failed to because it mutated a copy of the item from the array, not the original object:

for item in items:
item = jsonify_all(fix_reserved_columns(item))
item["_commit"] = commit_pk
db[item_table].insert_all(
items,
column_order=("_id",),
alter=True,
foreign_keys=(("_commit", "commits", "id"),),
)

@simonw
Copy link
Owner

simonw commented Nov 20, 2022

I think the fix is to detect if the item_table is missing that _commit column and add it.

@tomviner
Copy link
Contributor Author

The schema is only changed for those using the non---id branch (and worth fixing carefully for those users). But not in fact scrape-instances-social currently.

But as you've probably realised, this fix removes the need for the --id workaround you mention in tracking-mastodon. Stop setting --id, and use the simpler set of tables with a join to get the commit date:
image

simonw added a commit that referenced this pull request Jan 11, 2023
hueyy added a commit to hueyy/lacuna-db that referenced this pull request Aug 1, 2023
hueyy added a commit to hueyy/lacuna-db that referenced this pull request Apr 30, 2025
simonw added a commit that referenced this pull request Dec 21, 2025
simonw added a commit that referenced this pull request Dec 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants