V3 idea: Loosen the restriction of action input as Merkle tree

I have observed that Bazel can spend a lot of CPU resources calculating Merkle tree digests. This has been discussed in https://github.com/bazelbuild/bazel/issues/10875 and [Extend the Action Cache with alias digests](https://groups.google.com/forum/#!topic/remote-execution-apis/F0Qb4m0J4Vg).

The key point is that the single input Merkle tree only needs to be resolved on a cache miss, which should be rare, so the client should be allowed to check for a cache hit using some other key that is cheaper to compute.
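
To make the cost concrete, here is a minimal sketch of how a root digest depends on every file and directory below it. It is a simplification and not the REAPI encoding: the real API hashes serialized `Directory` protos, while this sketch uses JSON, and the names `file_digest` and `directory_digest` are hypothetical.

```python
import hashlib
import json


def file_digest(content: bytes) -> str:
    """Digest of a single file's content."""
    return hashlib.sha256(content).hexdigest()


def directory_digest(tree: dict) -> str:
    """Digest of a directory given as {name: bytes (file) or dict (subdirectory)}.

    Entries are sorted by name so the same inputs always give the same digest.
    JSON stands in for the canonical proto serialization (an assumption).
    """
    entries = []
    for name in sorted(tree):
        node = tree[name]
        if isinstance(node, dict):
            # A subdirectory is referenced by its own digest (recursive definition).
            entries.append(["dir", name, directory_digest(node)])
        else:
            entries.append(["file", name, file_digest(node)])
    encoded = json.dumps(entries, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()
```

Changing one file changes the digest of every ancestor directory up to the root, so a client that does not cache intermediate digests walks the whole tree even when the action turns out to be a cache hit.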

One idea was to create an alias cache entry whose digest the client could calculate in any suitable way. The problem is that the alias has to be uploaded by the clients, which may be a trusted CI machine or an untrusted developer machine, rather than by the remote execution server. Therefore, using action cache aliases makes the system vulnerable to cache poisoning.

Instead, @EricBurnett suggests loosening the restriction on the input so that it can describe partial trees:
> https://github.com/bazelbuild/remote-apis/issues/140#issuecomment-636983411
> For merkle trees as inputs, the general properties we care about are:
> 
>    - Recursively defined, so that sharing trees in inputs doesn't require
>    operating on a whole tree each time
>    - Parallelly uploadable, so that it doesn't add unnecessary round-trips
>    on the order of the depth of the tree.

> https://groups.google.com/forum/#!msg/remote-execution-apis/F0Qb4m0J4Vg/QANi1BMdAgAJ
> I will note that Merkle Trees, when used as inputs, are defined as they are to achieve:
> 1. Reusability (sub-trees shared by two actions will share Merkle Tree nodes),
> 2. Determinism (the same set of inputs will always get the same tree, regardless of client)

What would be a good design?
1. Extend `message Directory` to include additional roots, not just subdirectories?
2. Let `Action.input_root_digest` be repeated?
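
As a rough illustration of option 2, the sketch below derives an action key from an ordered list of root digests instead of a single root. Everything here is an assumption made for illustration: `action_key` and the length-prefixed encoding are hypothetical and not part of the Remote Execution API, and the byte strings stand in for real directory trees.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def action_key(command: str, root_digests: list[str]) -> str:
    """Hypothetical key: hash the command with an ordered list of root digests."""
    parts = [command] + list(root_digests)
    # Length-prefix each part so different splits of the same bytes cannot collide.
    encoded = b"".join(
        len(p.encode()).to_bytes(8, "big") + p.encode() for p in parts
    )
    return sha256_hex(encoded)


# A large, rarely changing root is digested once and reused across actions.
toolchain_root = sha256_hex(b"large, rarely changing toolchain tree")
sources_v1 = sha256_hex(b"sources at revision 1")
sources_v2 = sha256_hex(b"sources at revision 2")

key_v1 = action_key("cc -c a.c", [toolchain_root, sources_v1])
key_v2 = action_key("cc -c a.c", [toolchain_root, sources_v2])
```

With this shape, only the root that changed needs to be re-digested. The trade-off is that the key now depends on how the client partitions and orders its roots, so two clients that split the same inputs differently would get different keys unless the partitioning is specified.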

Any other design ideas or any ideas to solve the problem in a totally different way?

V3 idea: Loosen the restriction of action input as Merkle tree #141
