V3 idea: No longer allow Digest.size_bytes <= 0 #134

@EdSchouten

Right now it is allowed to create Digest messages that have size_bytes == 0, referring to the empty blob. In #131 we're extending the protocol to require that the empty blob is always present, because it can be derived trivially. I personally find this a bit problematic:

  • It makes the protocol less regular and consistent.
  • Naïvely implemented clients/servers will get this wrong. For example, what is FindMissingBlobs() on {hash: "e984d2bdd07318c4e29f7a2ceea4a9e4569e2d8e695a953a4e2df6f69fbdec95", size_bytes: 0} supposed to do? Report existence, because it has size zero? Or report absence, because the empty blob actually has SHA-256 sum e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855?
  • When digests for empty blobs are embedded in other messages, they still waste space. We still end up storing a SHA-256 sum.
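
To make the second point concrete, here is a minimal Python sketch of two naive existence checks that disagree on that digest. The function names and the dict-based store are hypothetical and only illustrate the ambiguity; they are not taken from any real client or server.

```python
import hashlib

# SHA-256 of the empty blob, computed rather than hard-coded.
EMPTY_SHA256 = hashlib.sha256(b"").hexdigest()

# The inconsistent digest from the example above: size 0, but a hash
# that does not belong to the empty blob.
odd_hash = "e984d2bdd07318c4e29f7a2ceea4a9e4569e2d8e695a953a4e2df6f69fbdec95"
odd_size = 0


def is_missing_by_size(digest_hash, size_bytes, store):
    """Hypothetical server A: treats every zero-sized digest as the empty blob."""
    if size_bytes == 0:
        return False
    return digest_hash not in store


def is_missing_by_hash(digest_hash, size_bytes, store):
    """Hypothetical server B: ignores the size and looks up by hash only."""
    return digest_hash not in store


# A store that contains nothing but the (correctly hashed) empty blob.
store = {EMPTY_SHA256: b""}

print(is_missing_by_size(odd_hash, odd_size, store))  # False: reported as present
print(is_missing_by_hash(odd_hash, odd_size, store))  # True: reported as missing
```

Both behaviours look defensible in isolation, which is exactly why two implementations can end up disagreeing.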

I would like to suggest that we simply deny the existence of Digest messages with size_bytes <= 0. In any field where the empty blob needs to be referenced, we should use null instead. This means that the optimization that Bazel performs of not loading the empty blob becomes the norm, as there is no longer any way to even address the empty blob.
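
As a rough illustration of what this could look like on the client side, here is a Python sketch. The Digest dataclass, digest_of() and read_blob() are hypothetical stand-ins, not the generated protobuf classes or any existing API, and None plays the role of an unset (null) message field.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Digest:
    """Hypothetical stand-in for the Digest message."""
    hash: str
    size_bytes: int

    def __post_init__(self):
        # Under the proposal, a digest with size_bytes <= 0 cannot exist.
        if self.size_bytes <= 0:
            raise ValueError("Digest.size_bytes must be positive")


def digest_of(data: bytes) -> Optional[Digest]:
    """Return None for the empty blob, a Digest for anything else."""
    if not data:
        return None
    return Digest(hashlib.sha256(data).hexdigest(), len(data))


def read_blob(digest: Optional[Digest], store: Dict[str, bytes]) -> bytes:
    """Resolve a possibly-null digest; null never touches the store."""
    if digest is None:
        return b""
    return store[digest.hash]


store: Dict[str, bytes] = {}
d = digest_of(b"hello")
store[d.hash] = b"hello"

print(read_blob(d, store))              # b'hello'
print(read_blob(digest_of(b""), store)) # b''
```

With this shape, the empty blob is never uploaded, looked up, or hashed, and a malformed zero-sized digest is rejected at construction time instead of reaching FindMissingBlobs().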
