switch to new merkledag walk functions#6499
Conversation
|
You already merged ipfs/go-merkledag#41 so I'll comment here. First, considering how easy it is to add the root cid before using a graph walking function, I'd say that enumerating only children is more useful/clean as a reusable algorithm. Filtering the root after the fact works, but I'm sure you'll agree it's not that great. With the renaming forcing a re-evaluation of all the call sites, what about providing a more complete set of functions ?
At Infura, we actually have an additional need: we have a bunch of bad blocks that make walking the graph fail. An example: Having a way to make those walk not fail when a bad block is found would be very useful. Now, having 16 functions is no fun either, so maybe it's time to leverage the functional options pattern ? It would also allow to get more control for, say, the concurrency factor, a depth limit ... type WalkOptions struct {
WithRoot bool
MaxDepth int
StopOnError bool
Concurrency int
}
type WalkOption func(*WalkOptions)
func WithRoot() WalkOption {
return func(walkOptions *WalkOptions) {
walkOptions.WithRoot = true
}
}
func ContinueOnError() WalkOption {
return func(walkOptions *WalkOptions) {
walkOptions.StopOnError = false
}
}
func Concurrent() WalkOption {
return func(walkOptions *WalkOptions) {
walkOptions.Concurrency = defaultConcurrentFetch
}
}
func Concurrency(worker int) WalkOption {
return func(walkOptions *WalkOptions) {
walkOptions.Concurrency = worker
}
}
...We would end up with only two base functions ( func Walk(ctx context.Context, getLinks GetLinks, c cid.Cid, visit func(cid.Cid) bool, options ...WalkOption) error {
return nil
}
func WalkDepth(ctx context.Context, getLinks GetLinks, c cid.Cid, visit func(cid.Cid, int) bool, options ...WalkOption) error {
return nil
} |
|
@Stebalien what do you think about my proposal ? |
michaelavila
left a comment
There was a problem hiding this comment.
This reads much better. I ran CI again as the sharness tests had some unrelated failures (they went away). FWIW, I also think @MichaelMure's proposal is interesting. I like the idea of capturing some of these things as functional options.
|
In general, I completely agree that we need better dag traversal functions. The goal of this PR was to fix the immediate inconsistencies. I chose
Note: IPLD actually has a more complicated read-ahead, in-order walker: https://godoc.org/github.com/ipfs/go-ipld-format#Walker (example https://github.com/ipfs/go-unixfs/blob/06cb1c76adea6c76310b8b8c15444d06db02e745/io/dagreader.go#L171). However, it's complicated enough that I haven't bothered to try using it. |
c39e608 to
bfd56d7
Compare
EnumerateChildrenAsync has been renamed to WalkParallel to reflect the fact that: 1. It visits the root. 2. It's parallel, not async. To mirror this change, EnumerateChildren has also been renamed to Walk and now behaves the same (except that it's not parallel).
bfd56d7 to
9738d81
Compare
|
I'm going to merge this as it strictly improves things. |
EnumerateChildrenAsync has been renamed to WalkParallel to reflect the fact that:
To mirror this change, EnumerateChildren has also been renamed to Walk and now behaves the same (except that it's not parallel).