Fix ComputeOutputsToUpload to build directory tree consistently#411
Fix ComputeOutputsToUpload to build directory tree consistently#411
Conversation
go/pkg/client/tree.go
Outdated
| treePb.Root = rootDir | ||
| for _, c := range childDirs { | ||
| treePb.Children = append(treePb.Children, c) | ||
| err = filepath.Walk(absPath, func(path string, info fs.FileInfo, err error) error { |
There was a problem hiding this comment.
I'm guessing you want treePb.Children to be ordered and you are using filepath.Walk() to guarantee the ordering? (certainly an approach I didn't really expect :D)
An alternative is to put the keys of childDirs into an array, sort the array and then iterate the array - that would also guarantee ordering. I think sorting the strings might be more efficient (agreed we use extra space) since filepath.Walk() has to issue stat calls inorder to walk the directory? Wdyt?
There was a problem hiding this comment.
Agreed, I think sorting is probably the better approach. Although childDirs will contain all the child nodes. These are not necessarily the direct children of root, so I sorted the order in which we iterate the directories in packageDirectories and saved that order.
There was a problem hiding this comment.
As discussed offline, I removed children from packageDirectories and appended the child nodes directly to the tree that is now being passed in.
The new test case I added should test that they are being appended in lexical order according to path. Swapping line 1671 with 1672 or 1680 with 1681 (want output) will fail the test because ComputeOutputsToUpload would output the digest with the correct lexical directory structure.
Failure example when swapping 1671 and 1672
tree_test.go:1867: ComputeOutputsToUpload(...) gave diff (-want +got) on tree children:
map[digest.Digest]*remoteexecution.Directory{
s"7478e0d4249ff4975c89427103939e36dc4d0a3078422aa53d53a9f339bcc064/156": s`files:{name:"bar" digest:{hash:"fcde2b2edba56bf408601fb721fe9b5`...,
s"778eefa0c2ce4f6c10fa950be47e0c930c91d21eff989c0d7d98eaae25d8f8ea/157": s`directories:{name:"dirD" digest:{hash:"7bcce45c82c21e91851bcc49`...,
s"78dc0850b3947e960b72ab23bcaa2c9506d6fa46c5ea88f9427c1c75d9e28d41/79": s`files:{name:"foo" digest:{hash:"2c26b46b68ffc68ff99b453c1d30413`...,
- s"7a2254e528f733780bd310d336c828ebdfc3f11dfe7be0bbfa68d136a8a0f51d/156": s`directories:{name:"dirF" digest:{hash:"78dc0850b3947e960b72ab23bcaa2c9506d6fa46c5ea88f9427c1c75d9e28d41" size_bytes:79}} directories:{name:"dirC" digest:{hash:"d5cadb0a45f1af229e10ba007aa6c68f8889bfe47a0e0b6b730833de52b199d6" size_bytes:77}}`,
s"7bcce45c82c21e91851bcc49b2e69953159e11d087f6996c6a385e0ebe1fc0df/77": s`files:{name:"baz" digest:{hash:"baa5a0964d3320fbc0c6a922140453c`...,
s"d5cadb0a45f1af229e10ba007aa6c68f8889bfe47a0e0b6b730833de52b199d6/77": s`files:{name:"bar" digest:{hash:"fcde2b2edba56bf408601fb721fe9b5`...,
+ s"ec3f070535f82e8c26fee750cb23d47141bc96f31119e37716dbe24313dd3a09/156": s`directories:{name:"dirC" digest:{hash:"d5cadb0a45f1af229e10ba007aa6c68f8889bfe47a0e0b6b730833de52b199d6" size_bytes:77}} directories:{name:"dirF" digest:{hash:"78dc0850b3947e960b72ab23bcaa2c9506d6fa46c5ea88f9427c1c75d9e28d41" size_bytes:79}}`,
The want and got digest contents here are the same except the got digest has dirC appended before dirF.
gkousik
left a comment
There was a problem hiding this comment.
Lets also test this with an Android build if possible before merging and ensure that for some sample action that mismatches, we get a consistent output directory digest.
ramymedhat
left a comment
There was a problem hiding this comment.
lgtm after comments and lint errors are addressed
I ran multiple Android builds and it looks like it's working. I don't see any mismatches across multiple builds with local/remote reruns set to 5. I also logged the digests to make sure we're not seeing anything weird and it looks like it's matching the remote digests. |
|
Some additional tests I did: test command: script.sh: With SDK changes in re-client: Without SDK changes in re-client: The test above shows that without the sdk changes the local digests are being produced inconsistently due to the subdirectories even though the file contents are the same. With the sdk changes, there doesn't seem to be any mismatches. #!/bin/bash With SDK changes in re-client: Without SDK changes in re-client: This test was just to make sure that the mismatch detection is working as intended. Seems like the different mismatches are still being picked up and logged to reproxy.ERROR. |
|
Thanks for the comprehensive testing! |
…lbuild#411) Changes ComputeOutputsToUpload to build the directory trees based on the lexical order of the directories according to path so that the directory digests will be consistent.
Currently the output directory tree's child nodes are appended randomly in
ComputeOutputsToUpload. This changes theComputeOutputsToUploadto build the directory trees based on the lexical order of the sub directories so that we will get a consistent output.The new test case added in
TestComputeOutputsToUploadDirectorieswould give the following result before the changes: