feat(config): add Import.* for CID Profiles from IPIP-499 #11148
implements IPIP-499: add config options for controlling UnixFS DAG determinism and introduces `unixfs-v1-2025` and `unixfs-v0-2015` profiles for cross-implementation CID reproducibility.

changes:
- add Import.* fields: HAMTDirectorySizeEstimation, SymlinkMode, DAGLayout, IncludeEmptyDirectories, IncludeHidden
- add validation for all Import.* config values
- add unixfs-v1-2025 profile (recommended for new data)
- add unixfs-v0-2015 profile (alias: legacy-cid-v0)
- remove deprecated test-cid-v1 and test-cid-v1-wide profiles
- wire Import.HAMTSizeEstimationMode() to boxo globals
- update go.mod to use boxo with SizeEstimationMode support

ref: https://specs.ipfs.tech/ipips/ipip-0499/
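For illustration, validating enum-like Import.* values could look roughly like the sketch below. It is not kubo's actual code: `validateEnum` and the two slices are hypothetical names, the HAMT modes (`links`, `block`, `disabled`) are taken from the PR description, and accepting `trickle` next to `balanced` for the DAG layout is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical lists of accepted values. The HAMT modes come from the PR
// description; "trickle" as a DAG layout value is an assumption.
var (
	hamtSizeEstimationModes = []string{"links", "block", "disabled"}
	dagLayouts              = []string{"balanced", "trickle"}
)

// validateEnum returns an error when value is set but not listed in allowed.
// An empty value is treated as "unset, fall back to the default".
func validateEnum(field, value string, allowed []string) error {
	if value == "" {
		return nil
	}
	for _, a := range allowed {
		if value == a {
			return nil
		}
	}
	return fmt.Errorf("%s: invalid value %q (allowed: %s)",
		field, value, strings.Join(allowed, ", "))
}

func main() {
	checks := []struct {
		field   string
		value   string
		allowed []string
	}{
		{"Import.UnixFSHAMTDirectorySizeEstimation", "block", hamtSizeEstimationModes},
		{"Import.UnixFSHAMTDirectorySizeEstimation", "bytes", hamtSizeEstimationModes},
		{"Import.UnixFSDAGLayout", "balanced", dagLayouts},
	}
	for _, c := range checks {
		if err := validateEnum(c.field, c.value, c.allowed); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("ok:", c.field, "=", c.value)
	}
}
```

Treating the empty string as valid keeps unset fields on their defaults, so only explicit typos are rejected.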
bf5578b to d79f7de
add CLI flags for controlling file collection behavior during ipfs add:
- `--dereference-symlinks`: recursively resolve symlinks to their target content (replaces deprecated --dereference-args which only worked on CLI arguments). wired through go-ipfs-cmds to boxo's SerialFileOptions.
- `--empty-dirs` / `-E`: include empty directories (default: true)
- `--hidden` / `-H`: include hidden files (default: false)

these flags are CLI-only and not wired to Import.* config options because go-ipfs-cmds library handles input file filtering before the directory tree is passed to kubo. removed unused Import.UnixFSSymlinkMode config option that was defined but never actually read by the CLI.

also:
- wire --trickle to Import.UnixFSDAGLayout config default
- update go-ipfs-cmds to v0.15.1-0.20260117043932-17687e216294
- add SYMLINK HANDLING section to ipfs add help text
- add CLI tests for all three flags

ref: ipfs/specs#499
d79f7de to 01b1ce0
add comprehensive test suite for UnixFS CID determinism per IPIP-499:
- verify exact HAMT threshold boundary for both estimation modes:
  - v0-2015 (links): sum(name_len + cid_len) == 262144
  - v1-2025 (block): serialized block size == 262144
- verify HAMT triggers at threshold + 1 byte for both profiles
- add all deterministic CIDs for cross-implementation testing

also wires SizeEstimationMode through CLI/API, allowing Import.UnixFSHAMTSizeEstimation config to take effect.

bumps boxo to ipfs/boxo@6707376 which aligns HAMT threshold with JS implementation (uses > instead of >=), fixing CID determinism at the exact 256 KiB boundary.
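The boundary behavior described above can be sketched in a few lines. This is only an illustration of the `links` estimate and the strict `>` comparison, not boxo's implementation: the `link` type and all function names are hypothetical, and the 34-byte CID length is simply the size of a binary CIDv0 (sha2-256 multihash) used to make the arithmetic land exactly on 262144.

```go
package main

import (
	"fmt"
	"strings"
)

// hamtThreshold is the 256 KiB boundary quoted in the commit message.
const hamtThreshold = 262144

// link is a hypothetical stand-in for one directory entry.
type link struct {
	Name   string
	CIDLen int // length of the binary CID in bytes
}

// linksEstimate implements the "links" estimate: sum(name_len + cid_len).
func linksEstimate(links []link) int {
	total := 0
	for _, l := range links {
		total += len(l.Name) + l.CIDLen
	}
	return total
}

// shouldShard uses a strict ">" comparison, so a directory whose estimate
// equals the threshold exactly stays a basic (non-HAMT) directory.
func shouldShard(estimatedSize int) bool {
	return estimatedSize > hamtThreshold
}

// makeLinks builds n entries with the given name and CID lengths. Real
// directory entries need unique names; only the lengths matter here.
func makeLinks(n, nameLen, cidLen int) []link {
	links := make([]link, n)
	for i := range links {
		links[i] = link{Name: strings.Repeat("a", nameLen), CIDLen: cidLen}
	}
	return links
}

func main() {
	// 4096 entries * (30-byte name + 34-byte CID) = 262144 bytes exactly.
	entries := makeLinks(4096, 30, 34)
	size := linksEstimate(entries)
	fmt.Println("at threshold:", size, "shard:", shouldShard(size))

	// One extra byte in a single name crosses the boundary.
	entries[0].Name += "a"
	size = linksEstimate(entries)
	fmt.Println("threshold+1: ", size, "shard:", shouldShard(size))
}
```

With `>=` instead of `>`, the first case would already switch to a HAMT shard, which is the kind of one-byte disagreement that yields different CIDs across implementations.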
Previously, resolving symlinks required two flags:
- --dereference-args: resolved symlinks passed as CLI arguments
- --dereference-symlinks: resolved symlinks inside directories

Now --dereference-symlinks handles both cases. Users only need one flag to fully dereference symlinks when adding files to IPFS. The deprecated --dereference-args still works for backwards compatibility but is no longer necessary.
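To make the difference between keeping and dereferencing a symlink concrete, here is a minimal standard-library sketch. It does not reproduce how go-ipfs-cmds or boxo collect files: `collected`, `collect`, and `demo` are hypothetical names, and the single `dereference` argument stands in for the `--dereference-symlinks` flag.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// collected is a hypothetical record of how one path would be imported.
type collected struct {
	Path      string // path whose content would be read
	IsSymlink bool   // true when the link itself is preserved
	Target    string // link target, set only when IsSymlink is true
}

// collect inspects path without following it. With dereference set, a
// symlink is resolved to its target; otherwise the link is kept as a link.
func collect(path string, dereference bool) (collected, error) {
	info, err := os.Lstat(path) // Lstat does not follow symlinks
	if err != nil {
		return collected{}, err
	}
	if info.Mode()&os.ModeSymlink == 0 {
		return collected{Path: path}, nil
	}
	if dereference {
		resolved, err := filepath.EvalSymlinks(path)
		if err != nil {
			return collected{}, err
		}
		return collected{Path: resolved}, nil
	}
	target, err := os.Readlink(path)
	if err != nil {
		return collected{}, err
	}
	return collected{Path: path, IsSymlink: true, Target: target}, nil
}

// demo creates a temporary file plus a symlink to it, then collects the
// link once without and once with dereferencing.
func demo() (kept, followed collected, err error) {
	dir, err := os.MkdirTemp("", "symlink-demo")
	if err != nil {
		return
	}
	defer os.RemoveAll(dir)

	file := filepath.Join(dir, "file.txt")
	if err = os.WriteFile(file, []byte("hello"), 0o644); err != nil {
		return
	}
	link := filepath.Join(dir, "link")
	if err = os.Symlink(file, link); err != nil {
		return
	}
	if kept, err = collect(link, false); err != nil {
		return
	}
	followed, err = collect(link, true)
	return
}

func main() {
	kept, followed, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("kept as link:  %+v\n", kept)
	fmt.Printf("dereferenced:  %+v\n", followed)
}
```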
- update boxo to ebdaf07c (nil filter fix, thread-safety docs)
- simplify changelog for IPIP-499 section
- shorten test names, move context to comments
I may add more tests or improve the code, but it's ready for initial review, to course-correct early.
gammazero left a comment
All code looks good, and it looks like all test cases are covered.
Co-authored-by: Andrew Gillis <11790789+gammazero@users.noreply.github.com>
add test that confirms kubo uses balanced layout (all leaves at same depth) rather than balanced-packed (varying depths). creates 45MiB file to trigger multi-level DAG and walks it to verify leaf depth uniformity. includes trickle subtest to validate test logic can detect varying depths. supports CAR export via DAG_LAYOUT_CAR_OUTPUT env var for test vectors.
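The leaf-depth check can be pictured with a toy tree. The sketch below is not the test from this PR: `node` is a hypothetical in-memory stand-in for DAG nodes (the real test walks UnixFS blocks), and it only shows the idea that a balanced layout yields exactly one distinct leaf depth while an uneven, trickle-like shape yields several.

```go
package main

import "fmt"

// node is a hypothetical in-memory stand-in for a DAG node; a node without
// children is a leaf.
type node struct {
	Children []*node
}

// recordLeafDepths walks the tree and marks every depth at which a leaf occurs.
func recordLeafDepths(n *node, depth int, seen map[int]bool) {
	if n == nil {
		return
	}
	if len(n.Children) == 0 {
		seen[depth] = true
		return
	}
	for _, c := range n.Children {
		recordLeafDepths(c, depth+1, seen)
	}
}

// hasUniformLeafDepth reports whether all leaves sit at the same depth.
func hasUniformLeafDepth(root *node) bool {
	seen := map[int]bool{}
	recordLeafDepths(root, 0, seen)
	return len(seen) == 1
}

func main() {
	// Balanced: every leaf is two links below the root.
	balanced := &node{Children: []*node{
		&node{Children: []*node{&node{}, &node{}}},
		&node{Children: []*node{&node{}}},
	}}

	// Uneven: one leaf hangs directly off the root, another sits one level deeper.
	uneven := &node{Children: []*node{
		&node{},
		&node{Children: []*node{&node{}}},
	}}

	fmt.Println("balanced uniform:", hasUniformLeafDepth(balanced))
	fmt.Println("uneven uniform:  ", hasUniformLeafDepth(uneven))
}
```

Running the same check against a deliberately uneven tree, as the trickle subtest does, guards against a walker that would report uniform depth for any input.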
switches to ipfs/boxo@6141039

changes since 5cf22196ad0b:
- refactor(unixfs): use arithmetic for exact block size calculation
- refactor(unixfs): unify size tracking and make SizeEstimationMode immutable
- feat(unixfs): optimize SizeEstimationBlock and add mode/mtime tests

also clarifies that directory sharding globals affect both `ipfs add` and MFS.
- add UnixFSDataType() helper to directly check UnixFS type via protobuf
- refactor threshold tests to use exact +1 byte calculations instead of +1 file
- verify directory type directly (ft.TDirectory vs ft.THAMTShard) instead of inferring from link count
- clean up helper function signatures by removing unused cidLength parameter
remove duplicate profile threshold tests from add_test.go since they are fully covered by the data-driven tests in cid_profiles_test.go.

changes:
- improve test names to describe what threshold is being tested
- add inline documentation explaining each test's purpose
- add byte-precise helper IPFSAddDeterministicBytes for threshold tests
- remove ~200 lines of duplicated test code from add_test.go
- keep non-profile tests (pinning, symlinks, hidden files) in add_test.go
…s-2025

# Conflicts:
# docs/examples/kubo-as-a-library/go.mod
# docs/examples/kubo-as-a-library/go.sum
# go.mod
# go.sum
# test/dependencies/go.mod
# test/dependencies/go.sum
3e4059b to 800cba9
Triage note:
- fix typo in files write help text
- update boxo with CI fixes (gofumpt, race condition in test)
…s-2025

# Conflicts:
# docs/examples/kubo-as-a-library/go.mod
# docs/examples/kubo-as-a-library/go.sum
# go.mod
# go.sum
# test/dependencies/go.mod
# test/dependencies/go.sum
includes binary content types fix: gzip, zip, vnd.ipld.car, vnd.ipld.raw, vnd.ipfs.ipns-record
includes refactor of maxLinks check in addLinkChild (review feedback).
af565e3 to eca0b5d
skip '@helia/mfs - should have the same CID after creating a file' test until helia implements IPIP-499 (tracking: ipfs/helia#941)

the test fails because kubo now collapses single-block files to raw CIDs while helia explicitly uses reduceSingleLeafToSelf: false

changes:
- run aegir directly instead of helia-interop binary (binary ignores --grep flags)
- cache node_modules keyed by @helia/interop version from npm registry
- skip npm install on cache hit (matches ipfs-webui caching pattern)
eca0b5d to a018d14
    - name: Install @helia/interop
      if: steps.helia-cache.outputs.cache-hit != 'true'
      run: npm install @helia/interop
    # TODO(IPIP-499): Remove --grep --invert workaround once helia implements IPIP-499
ℹ️ @achingbrain FYI, I'm skipping that one test for now. While I wrote that this is blocked until IPIP-499 lands in Helia, it seems Helia already bent over backwards to simulate buggy behavior from Kubo (reduceSingleLeafToSelf: false). So maybe once 0.40.0-rc1 is tagged, Helia could switch to the updated Kubo and flip the flag, and things will pass again (without waiting for IPIP-499)?
@gammazero small changes since your last review:
All CI checks are passing. If there are no concerns, I will merge tomorrow to unblock RC1.
…s-2025

# Conflicts:
# docs/examples/kubo-as-a-library/go.mod
# docs/examples/kubo-as-a-library/go.sum
# go.mod
# go.sum
# test/dependencies/go.mod
# test/dependencies/go.sum
includes latest upstream changes from boxo main
92e51ed to e69b33f
gammazero left a comment
All additional changes look good.
switches to boxo@main after merging ipfs/boxo#1088
switches to go-ipfs-cmds@master after merging ipfs/go-ipfs-cmds#315
7055ac1 to 0284f7b
Thanks! Switched to boxo and cmds from their respective master branches. Moving forward, we will test and gather feedback during 0.40 RC1.
Implements IPIP-499: UnixFS CID Determinism
Depends on:
Closes #11071
Changes
CID Profiles
Apply a profile to pin down import settings for reproducible CIDs:
Available profiles:
- `unixfs-v1-2025`: modern defaults (CIDv1, sha2-256, raw leaves, 1 MiB chunks)
- `unixfs-v0-2015` (alias: `legacy-cid-v0`): legacy CIDv0 behavior

Removes deprecated `test-cid-v1` and `test-cid-v1-wide` profiles.

New Config Options

- `Import.UnixFSHAMTDirectorySizeEstimation`: HAMT threshold mode (`links`, `block`, `disabled`)
- `Import.UnixFSDAGLayout`: `balanced` (+ optional trickle) but in the future we could have others

MFS Improvements

`ipfs files` commands now respect `Import.*` config:

Fix: single-block files in CIDv1 directories now produce raw CIDs (matching `ipfs add` behavior)

New CLI Flags

- `--dereference-symlinks`: resolve all symlinks to target content
- `--empty-dirs` / `-E`: include empty directories
- `--hidden` / `-H`: include hidden files

Deprecates `--dereference-args` (subsumed by `--dereference-symlinks`).

Tests