Skip to content

feat!: add hash_kind to TreeRefIter and Data#2497

Merged
Sebastian Thiel (Byron) merged 4 commits into
GitoxideLabs:mainfrom
cruessler:pass-hash-len-to-tree-ref-iter
Apr 20, 2026
Merged

feat!: add hash_kind to TreeRefIter and Data#2497
Sebastian Thiel (Byron) merged 4 commits into
GitoxideLabs:mainfrom
cruessler:pass-hash-len-to-tree-ref-iter

Conversation

@cruessler

@cruessler Christoph Rüßler (cruessler) commented Mar 31, 2026

Copy link
Copy Markdown
Contributor

This PR is an attempt at making the hash length in gix_object::tree::ref_iter::decode depend on the decoded tree’s hash’s size. Previously, it was hardcoded to 20 which is only correct for SHA1, but not for SHA256.

There were quite a few follow-up changes as TreeRef::from_bytes as well as TreeRefIter::from_bytes now take hash_kind as an argument. Also, since TreeRef can be instantiated in Data::decode, Data now also has to carry hash_kind. Another option that I didn’t explore would have been to add hash_kind as an argument to Data::decode.

Update 2026-04-04: I changed the initial addition of hash_len to hash_kind. Tests are now passing, with the exception of the fuzzer having found an issue.

Review Tasks

  • fix CI and cleanup history
  • review
  • refactor

@Byron Sebastian Thiel (Byron) left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot! I think it's great to see that overall, passing the hash kind through isn't crazy. I also left a few comments which I hope can lead to a small direction change.

Comment thread gix-object/src/object/mod.rs Outdated
Comment thread gix-object/src/tree/ref_iter.rs Outdated
Comment thread gix-object/src/lib.rs Outdated
Comment thread gix-object/src/lib.rs Outdated
@cruessler Christoph Rüßler (cruessler) changed the title feat!: add hash_len to TreeRefIter and Data feat!: add hash_kind to TreeRefIter and Data Apr 4, 2026
@cruessler Christoph Rüßler (cruessler) marked this pull request as ready for review April 4, 2026 09:20
@cruessler

Copy link
Copy Markdown
Contributor Author

Sebastian Thiel (@Byron) I made some changes based on your feedback. The PR contains a lot of small commits now, feel free to shuffle them around as you see fit!

The fuzzer seems to have found something, though at first sight, I wasn’t sure what to make of that finding. I’ll try figuring out what this is about, but maybe you’re more familiar with it and can quickly trace it back to one of my changes.

I added one TODO(SHA256) in gix-diff-tests. I plan on addressing it in the follow-up PR that makes gix-diff-tests run with SHA256 (we don’t do that yet).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7383f0210e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "Codex (@codex) review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "Codex (@codex) address that feedback".

Comment thread gix/src/object/tree/mod.rs Outdated
@Byron Sebastian Thiel (Byron) force-pushed the pass-hash-len-to-tree-ref-iter branch 2 times, most recently from 3586649 to 3430e49 Compare April 19, 2026 23:07
- fix fuzz test
- document `hash_kind` parameter
- minor refactors

Co-authored-by: Sebastian Thiel <sebastian.thiel@icloud.com>
Co-authored-by: Sebastian Thiel <sebastian.thiel@icloud.com>

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates gitoxide’s object decoding to correctly handle non-SHA1 repositories by making tree entry hash sizes depend on the decoded tree’s hash kind (e.g., 32 bytes for SHA-256 instead of a hardcoded 20). This propagates hash_kind through core borrowed-object APIs (TreeRefIter, TreeRef::from_bytes, Data, and ObjectRef parsing) and updates downstream callers and tests accordingly.

Changes:

  • Add hash_kind to gix_object::TreeRefIter and gix_object::Data, and thread it through tree decoding/iteration and object decoding entrypoints.
  • Update call sites across gix, gix-pack, gix-odb, gix-ref, gix-merge, and gitoxide-core to provide the correct hash kind.
  • Extend/adjust tests and fixtures (including SHA-256 fixtures) and make diff test snapshots hash-agnostic.

Reviewed changes

Copilot reviewed 46 out of 50 changed files in this pull request and generated no comments.

Show a summary per file
File Description
justfile Runs gix-diff-tests with SHA-256 fixtures in unit-tests recipe.
gix/tests/gix/object/tree/mod.rs Adds regression test ensuring tree decode uses the tree id’s hash kind.
gix/src/repository/impls.rs Populates hash_kind when producing empty-tree Data.
gix/src/repository/diff.rs Passes tree id hash kind into TreeRefIter::from_bytes() for diffing.
gix/src/object/tree/traverse.rs Passes root tree id hash kind into traversal iterator.
gix/src/object/tree/mod.rs Uses tree id hash kind for decode/iteration and for Data::new() during lookups.
gix/src/object/tree/diff/for_each.rs Passes per-tree hash kind into diff iterator construction.
gix/src/object/mod.rs Constructs gix_object::Data with id-derived hash kind for decoding helpers.
gix-traverse/src/tree/depthfirst.rs Supplies hash kind to TreeRefIter::from_bytes() during traversal.
gix-ref/tests/refs/file/mod.rs Test Find impl now sets Data.hash_kind from requested id.
gix-ref/src/store/file/raw_ext.rs Adjusts Data destructuring for added hash_kind field.
gix-pack/tests/pack/data/input.rs Test objects and Find impl now set hash_kind from env/id.
gix-pack/src/index/verify.rs Uses new ObjectRef::from_bytes(data, kind, hash_kind) signature.
gix-pack/src/index/traverse/mod.rs Builds Data with id-derived hash kind before checksum verification.
gix-pack/src/data/output/count/objects/mod.rs Uses obj.hash_kind when iterating trees.
gix-pack/src/bundle/find.rs Sets Data.hash_kind from pack object hash.
gix-odb/src/store_impls/loose/find.rs Sets Data.hash_kind from requested id.
gix-odb/src/store_impls/dynamic/find.rs Sets Data.hash_kind from pack object hash in dynamic store.
gix-odb/src/memory.rs Sets Data.hash_kind from requested id.
gix-odb/src/cache.rs Creates cached Data with id.kind() as hash kind.
gix-object/tests/object/tree/iter.rs Updates tree iterator tests to supply hash kind (from env).
gix-object/tests/object/tree/from_bytes.rs Updates tree decode tests to supply hash kind (from env).
gix-object/tests/object/tree/entries.rs Updates tree parsing tests to supply hash kind (from env).
gix-object/tests/object/tree/editor.rs Test Find impl now sets Data.hash_kind from requested id.
gix-object/tests/object/tag.rs Adds test for invalid target id length in tag parsing.
gix-object/tests/object/object_ref.rs Updates ObjectRef::from_loose() tests for new signature.
gix-object/tests/object/encode.rs Adds hash-kind-aware roundtrip macro and updates tree roundtrip.
gix-object/tests/object/commit/mod.rs Adds test for invalid object id length in commit parsing.
gix-object/src/tree/ref_iter.rs Core change: TreeRefIter carries hash kind; tree decoding uses hash-len parameter.
gix-object/src/parse.rs Tightens hex_hash parsing to only accept lengths matching enabled hash kinds.
gix-object/src/object/mod.rs ObjectRef::from_loose/from_bytes now require hash kind to decode trees correctly.
gix-object/src/lib.rs Updates docs and adds hash_kind fields to TreeRefIter and Data structs.
gix-object/src/data.rs Data::new() signature changed to include hash kind; tree decode uses it.
gix-object/fuzz/fuzz_targets/fuzz_tree.rs Fuzzes tree decoding for both SHA-1 and SHA-256 hash kinds.
gix-object/fuzz/Cargo.toml Enables gix-hash sha1+sha256 in fuzz crate dependencies.
gix-object/benches/edit_tree.rs Bench Find impl now sets Data.hash_kind.
gix-object/benches/decode_objects.rs Bench uses env-selected hash kind for tree parsing benchmarks.
gix-merge/tests/merge/blob/mod.rs Test Find impl now sets Data.hash_kind from requested id.
gix-merge/src/tree/function.rs Supplies base tree hash kind to TreeRefIter::from_bytes().
gix-diff/tests/fixtures/generated-archives/make_blob_repo_sha256.tar Adds SHA-256 fixture archive for diff tests.
gix-diff/tests/fixtures/generated-archives/.gitignore Ignores newly added SHA-256 generated fixture tar.
gix-diff/tests/diff/tree_with_rewrites.rs Makes snapshots hash-agnostic and opens ODB with correct object hash kind.
gix-diff/tests/diff/tree.rs Uses hash-agnostic assertions and shared ODB-opening helper.
gix-diff/tests/diff/main.rs Adds fixture hash-kind detection, ODB open helper, and snapshot normalization utilities.
gix-diff/tests/diff/index.rs Updates snapshots/assertions and expected ids for SHA-1 vs SHA-256 fixtures.
gix-diff/tests/Cargo.toml Enables gix-hash sha1+sha256 features for diff test crate.
gitoxide-core/src/index/checkout.rs Ensures Data.hash_kind is set in custom Find implementations.

@Byron Sebastian Thiel (Byron) merged commit 7d50c30 into GitoxideLabs:main Apr 20, 2026
36 checks passed
@Byron

Copy link
Copy Markdown
Member

Thanks so much!

Please note that I accidentally added gix-diff SHA-256 tests, please double-check if I am missing something obvious.

@cruessler

Copy link
Copy Markdown
Contributor Author

Thanks so much!

Please note that I accidentally added gix-diff SHA-256 tests, please double-check if I am missing something obvious.

No worries! I’ve created #2532 as a small follow-up.

Eliah Kagan (EliahKagan) added a commit to GitoxideLabs/cargo-smart-release that referenced this pull request Jun 4, 2026
Update call sites for gix 0.83's API: `SignatureRef::from_bytes` now
returns a `Result` rather than an `Option` [1], and
`TreeRefIter::from_bytes` takes the object hash kind [2]. The crate sets
`default-features = false` on gix, so also enable the `sha1` feature,
which `gix-hash` no longer enables by default [3].

For `TreeRefIter::from_bytes`, pass the hash kind of the tree id already
in hand (`id.kind()`), matching how gitoxide's own tree API derives it.

Make the `Item` size guard robust to the enabled hash features instead of
asserting an absolute byte count. `gix::ObjectId` is a SHA-1/SHA-256 enum
whose width depends on which `gix-hash` features are unified into the
build; the gix-testtools dev-dependency enables `gix-hash/sha256` in test
builds, widening it there but not in production. Bound only the part we
own rather than the gix-controlled fields.

[1]: GitoxideLabs/gitoxide@91c854e (GitoxideLabs/gitoxide#2546)
[2]: GitoxideLabs/gitoxide@3b156e8 (GitoxideLabs/gitoxide#2497)
[3]: GitoxideLabs/gitoxide@1019b42 (GitoxideLabs/gitoxide#2441)

Co-authored-by: Eliah Kagan <degeneracypressure@gmail.com>
Eliah Kagan (EliahKagan) added a commit to GitoxideLabs/cargo-smart-release that referenced this pull request Jun 4, 2026
Update call sites for gix 0.83's API: `SignatureRef::from_bytes` now
returns a `Result` rather than an `Option` [1], and
`TreeRefIter::from_bytes` takes the object hash kind [2]. The crate sets
`default-features = false` on gix, so also enable the `sha1` feature,
which `gix-hash` no longer enables by default [3].

For `TreeRefIter::from_bytes`, pass the hash kind of the tree id already
in hand (`id.kind()`), matching how gitoxide's own tree API derives it.

Update the `Item` size guard from 200 to 240 bytes. The gix-testtools
dev-dependency enables `gix-hash/sha256` in test builds (where this guard
runs), so `gix::ObjectId` is its wider SHA-1/SHA-256 enum (33 bytes) and
`Item` holds three of them; a production (sha1-only) build is ~200.

[1]: GitoxideLabs/gitoxide@91c854e (GitoxideLabs/gitoxide#2546)
[2]: GitoxideLabs/gitoxide@3b156e8 (GitoxideLabs/gitoxide#2497)
[3]: GitoxideLabs/gitoxide@1019b42 (GitoxideLabs/gitoxide#2441)

Co-authored-by: Eliah Kagan <degeneracypressure@gmail.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Eliah Kagan (EliahKagan) added a commit to GitoxideLabs/cargo-smart-release that referenced this pull request Jun 4, 2026
Update call sites for gix 0.83's API: `SignatureRef::from_bytes` now
returns a `Result` rather than an `Option` [1], and
`TreeRefIter::from_bytes` takes the object hash kind [2]. The crate sets
`default-features = false` on gix, so also enable the `sha1` feature,
which `gix-hash` no longer enables by default [3].

For `TreeRefIter::from_bytes`, pass the hash kind of the tree id already
in hand (`id.kind()`), matching how gitoxide's own tree API derives it.

Update the `Item` size guard from 200 to 240 bytes. The gix-testtools
dev-dependency enables `gix-hash/sha256` in test builds (where this guard
runs), so `gix::ObjectId` is its wider SHA-1/SHA-256 enum (33 bytes) and
`Item` holds three of them; a production (sha1-only) build is ~200.

[1]: GitoxideLabs/gitoxide#2546
[2]: GitoxideLabs/gitoxide@3b156e8 (GitoxideLabs/gitoxide#2497)
[3]: GitoxideLabs/gitoxide@1019b42 (GitoxideLabs/gitoxide#2441)

Co-authored-by: Eliah Kagan <degeneracypressure@gmail.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Eliah Kagan (EliahKagan) added a commit to GitoxideLabs/cargo-smart-release that referenced this pull request Jun 4, 2026
Update call sites for gix 0.83's API: `SignatureRef::from_bytes` now
returns a `Result` rather than an `Option` [1], and
`TreeRefIter::from_bytes` takes the object hash kind [2]. The crate sets
`default-features = false` on gix, so also enable the `sha1` feature,
which `gix-hash` no longer enables by default [3].

For `TreeRefIter::from_bytes`, pass the hash kind of the tree id already
in hand (`id.kind()`), matching how gitoxide's own tree API derives it.

Update the `Item` size guard from 200 to 240 bytes. The gix-testtools
dev-dependency enables `gix-hash/sha256` in test builds (where this guard
runs), so `gix::ObjectId` is its wider SHA-1/SHA-256 enum (33 bytes) and
`Item` holds three of them; a production (sha1-only) build is ~200.

[1]: GitoxideLabs/gitoxide#2546
[2]: GitoxideLabs/gitoxide@3b156e8 (GitoxideLabs/gitoxide#2497)
[3]: GitoxideLabs/gitoxide@1019b42 (GitoxideLabs/gitoxide#2441)

Co-authored-by: Eliah Kagan <degeneracypressure@gmail.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Eliah Kagan (EliahKagan) added a commit to GitoxideLabs/cargo-smart-release that referenced this pull request Jun 4, 2026
Update call sites for gix 0.83's API: `SignatureRef::from_bytes` now
returns a `Result` rather than an `Option` [1], and
`TreeRefIter::from_bytes` takes the object hash kind [2]. The crate sets
`default-features = false` on gix, so also enable the `sha1` feature,
which `gix-hash` no longer enables by default [3].

For `TreeRefIter::from_bytes`, pass the hash kind of the tree id already
in hand (`id.kind()`), matching how gitoxide's own tree API derives it.

Update the `Item` size guard from 200 to 240 bytes. The gix-testtools
dev-dependency enables `gix-hash/sha256` in test builds (where this guard
runs), so `gix::ObjectId` is its wider SHA-1/SHA-256 enum (33 bytes) and
`Item` holds three of them; a production (sha1-only) build is ~200.

[1]: GitoxideLabs/gitoxide@b060eb2,
GitoxideLabs/gitoxide@91c854e (GitoxideLabs/gitoxide#2546)
[2]: GitoxideLabs/gitoxide@3b156e8 (GitoxideLabs/gitoxide#2497)
[3]: GitoxideLabs/gitoxide@1019b42 (GitoxideLabs/gitoxide#2441)

Co-authored-by: Eliah Kagan <degeneracypressure@gmail.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Eliah Kagan (EliahKagan) added a commit to GitoxideLabs/cargo-smart-release that referenced this pull request Jun 4, 2026
Update call sites for gix 0.83's API: `SignatureRef::from_bytes` now
returns a `Result` rather than an `Option` [1], and
`TreeRefIter::from_bytes` takes the object hash kind [2]. The crate sets
`default-features = false` on gix, so also enable the `sha1` feature,
which `gix-hash` no longer enables by default [3].

For `TreeRefIter::from_bytes`, pass the hash kind of the tree id already
in hand (`id.kind()`), matching how gitoxide's own tree API derives it.

Update the `Item` size guard from 200 to 240 bytes. The gix-testtools
dev-dependency enables `gix-hash/sha256` in test builds (where this guard
runs), so `gix::ObjectId` is its wider SHA-1/SHA-256 enum (33 bytes) and
`Item` holds three of them; a production (sha1-only) build is ~200.

[1]: GitoxideLabs/gitoxide@b060eb2,
     GitoxideLabs/gitoxide@91c854e (GitoxideLabs/gitoxide#2546)
[2]: GitoxideLabs/gitoxide@3b156e8 (GitoxideLabs/gitoxide#2497)
[3]: GitoxideLabs/gitoxide@1019b42 (GitoxideLabs/gitoxide#2441)

Co-authored-by: Eliah Kagan <degeneracypressure@gmail.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Eliah Kagan (EliahKagan) added a commit to GitoxideLabs/cargo-smart-release that referenced this pull request Jun 4, 2026
Update call sites for gix 0.83's API: `SignatureRef::from_bytes` is no
longer generic over the error type and returns a concrete-error `Result`
[1], and `TreeRefIter::from_bytes` takes the object hash kind [2]. The
crate sets `default-features = false` on gix, so also enable the `sha1`
feature, which `gix-hash` no longer enables by default [3].

For `TreeRefIter::from_bytes`, pass the hash kind of the tree id already
in hand (`id.kind()`), matching how gitoxide's own tree API derives it.

Update the `Item` size guard from 200 to 240 bytes. The gix-testtools
dev-dependency enables `gix-hash/sha256` in test builds (where this
guard runs), so `gix::ObjectId` is its wider SHA-1/SHA-256 enum (33
bytes) and `Item` holds three of them; a production (sha1-only) build
is ~200.

[1]: GitoxideLabs/gitoxide@b060eb2,
     GitoxideLabs/gitoxide@91c854e (GitoxideLabs/gitoxide#2546)
[2]: GitoxideLabs/gitoxide@3b156e8 (GitoxideLabs/gitoxide#2497)
[3]: GitoxideLabs/gitoxide@1019b42 (GitoxideLabs/gitoxide#2441)

Co-authored-by: Eliah Kagan <degeneracypressure@gmail.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants