Skip to content

Make bufimagebuild.Builder take a Module instead of a ModuleFileSet#2212

Merged
bufdev merged 2 commits intomainfrom
image-builder-to-module
Jun 20, 2023
Merged

Make bufimagebuild.Builder take a Module instead of a ModuleFileSet#2212
bufdev merged 2 commits intomainfrom
image-builder-to-module

Conversation

@bufdev
Copy link
Member

@bufdev bufdev commented Jun 19, 2023

A very common operation we want to do throughout our codebases is go from a Module to an Image. However, you can't - the question would be, why not?

The answer is that we have an intermediate state, a ModuleFileSet, that "flattens" a Module from what amounts to a set of .proto files and a lock file, to a set of combined .proto files that represent both the module's .proto files and the dependencies' .proto files. This is what we pass to i.e. protocompile. Currently, to compile a Module, you first need to make a ModuleFileSet, and then pass the ModuleFileSet to the bufimagebuild.Builder.

This can make some kind of sense in the abstract, but in practice, 99.9% of our usage of bufimagebuild.Image is someone creating a ModuleFileSetBuilder right before, creating a ModuleFileSet, and then creating an Image. This is wasteful and confusing to those who don't fully understand the (convoluted) type system in bufmodule - you should be able to go right from a Module to an Image.

On top of this, confusingly, ModuleFileSet inherits from Module, which causes a host of issues. There are places in the code where one might be expecting a Module, but are instead passed a ModuleFileSet, which stealthily modifies the behavior of GetModuleFile such that files from both the sources and dependencies will be returned. You could, in fact, also create a ModuleFileSet from a ModuleFileSet, recursive behavior that our code does not actually handle (and will result in a bug from duplicate files).

It's been a longstanding TODO in the codebase to eliminate ModuleFileSet altogether, and instead just use Modules, perhaps with a function on Module AllFiles() that returns what amounts to a ModuleFileBucket, a type that will be completely distinct from Module.

This PR does the first step of this: it changes bufimagebuild.Builder to take a Module instead of a ModuleFileSet, and then just inlines the (very) simple operation that ModuleFileSet does. To do this, bufimagebuild.Builder now takes a ModuleReader directly. In practice, in our codebase, this eliminates a lot of extra code to use a ModuleFileSet, and makes it easier for future developers to understand how to do from Modules to Images.

One thing that is a little more complicated with this is some unit testing we do around modules and their dependencies - often, you want to pass a manually-built ModuleFileSet to the bufimagebuild.Builder. To help with this, new functions have been added to bufmoduletesting with examples to provide functionality to build test lock files and test ModuleReaders.

One pain point is buf export, which actually needs both the equivalent of ModuleFileSets, and the ability to build an Image, but this is minor and can be dealt with when ModuleFileSets are actually eliminated. However, for now during this migration, bufimagebuild.Builder handles the case if a Module is actually a ModuleFileSet, and does not call the ModuleFileSetBuilder in this case.

A PR in core will be done in conjunction with this PR to handle the new behavior.

This PR also reverts the addition of NewReadWriteBucketWithOptions in storagemem, which isn't consistent with the rest of our codebase.

@bufdev bufdev requested a review from doriable June 19, 2023 18:03
@bufdev
Copy link
Member Author

bufdev commented Jun 19, 2023

Note this PR could be considered somewhat controversial, so we should discuss tomorrow. I can see arguments both ways on it.

@bufdev
Copy link
Member Author

bufdev commented Jun 19, 2023

Some notes for discussion:

  • This PR may be solving the secondary problem (simplifying the 99% case for the compiler) and not the primary problem (ModuleFileSet should not inherit from Module). It's nice to make it so that most people don't have to worry about ModuleFileSets at all, but the real ticking time bomb is the inheritance. We could make it, for example, where Module produces a non-derivative "flattened" type that just allows access to all .proto files, but doesn't inherit from Module.
  • There's three cases where you want something like ModuleFileSet: exporting, compiling, and testing. buf export can be special-cased, the compiler can be simplified as above, but testing is the real issue right now. Having gone through this and the correspondent PR in core, it's very obvious that our testing infrastructure for Modules is very lacking. We don't have a good way to build up test Modules, especially with dependencies, and a good way to have test ModuleReaders. We should work to simplify this massively, and a lot of the pain here might go away.
  • Separately, we should consider trying to mostly deprecate ModulePin, and replace it with a new type ModuleCommit, that just has the commit on it. It would make creating test ModuleReaders much simpler, and would simplify a ton of the code. There's a lot of stuff on ModulePin that 99% of people shouldn't be concerned with, and it makes testing a lot harder.

The storagemem stuff could/should be split out into a separate PR if we pursue this.

@bufdev
Copy link
Member Author

bufdev commented Jun 19, 2023

Other comment: the inheritance throughout the types in bufmoduleref is a ticking time bomb as well.

@bufdev
Copy link
Member Author

bufdev commented Jun 20, 2023

Talked offline: we're merging this for now as it's a net win, and will continue to make improvements as we go. Next up is a better module testing framework, and eliminating ModuleFileSets altogether.

@bufdev bufdev merged commit 6ddf4be into main Jun 20, 2023
@bufdev bufdev deleted the image-builder-to-module branch June 20, 2023 17:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants