Skip to content

Allow previous solve results to be used in new solves#1286

Merged
tonistiigi merged 5 commits intomoby:masterfrom
hinshun:llbstate-from-result
Jan 24, 2020
Merged

Allow previous solve results to be used in new solves#1286
tonistiigi merged 5 commits intomoby:masterfrom
hinshun:llbstate-from-result

Conversation

@hinshun
Copy link
Collaborator

@hinshun hinshun commented Dec 12, 2019

Depends on #1269
Fixes #751

Edit:

This PR introduces the ability to essentially marshal and unmarshal between llb.State and pb.Definition. In addition, it also allows using previously solved client.Result to be used as an input to a new build graph.

For example, we can do some parallel solve requests, and then use their results in a new graph:

eg, ctx := errgroup.WithContext(ctx)

inputs := []llb.State{
	llb.Scratch().File(llb.Mkfile("/out", 0600, []byte("foo"))),
	llb.Scratch().File(llb.Mkfile("/out", 0600, []byte("bar"))),
}

var mu sync.Mutex
outputs := make([]llb.State, len(inputs))

for i, st := range inputs {
	i, st := i, st
	eg.Go(func() error {
		def, err := st.Marshal()
		if err != nil {
			return err
		}

		res, err := c.Solve(ctx, client.SolveRequest{
			Definition: def.ToPB(),
		})
		if err != nil {
			return err
		}

		ref, err := res.SingleRef()
		if err != nil {
			return err
		}

		output, err := ref.ToState()
		if err != nil {
			return err
		}

		mu.Lock()
		outputs[i] = output
		mu.Unlock()
		return nil
	})
}

err := eg.Wait()
if err != nil {
	return err
}

output := llb.Scratch()

for i, output := range outputs {
	output = output.File(
		llb.Copy(output, "/out", "/out"+i),
	)
}

def, err := output.Marshal()
if err != nil {
	return err
}

res, err := c.Solve(ctx, client.SolveRequest{
	Definition: def.ToPB(),
})

// ...

In order for previous solve results (client.Result).SingleRef to be used as an argument for llb.NewState, client.Reference must implement llb.Output. However, there is an import cycle related to gw.ResolveImageConfigOpt which I moved to the llb package. Let me know if you prefer this as a separate PR.

client.Reference implements llb.Output with the help of the pb.Definition of the previous solve. When the new solve references an Op from a previous solve, it must still traverse the graph so the marshaled definitions responsible for the previous solve's result must be included in the new one. We implemented in a way such that users are not exposed to this detail.


Questions

  1. When the *llbBridge handles a definition-based solve, it loads an edge from the definition:
    edge, err := Load(req.Definition, dpc.Load, ValidateEntitlements(ent), WithCacheSources(cms), RuntimePlatforms(b.platforms), WithValidateCaps())
    if err != nil {
    return nil, errors.Wrap(err, "failed to load LLB")
    }

Definitions end with an empty op ((pb.Op).Op == nil) with a single input pointing to the last Op. The Load function returns the edge referencing this last Op:

dgst = lastOp.Inputs[0].Digest

Which is eventually built by a solver.Job:

ref, err := b.builder.Build(ctx, edge)
if err != nil {
return nil, errors.Wrap(err, "failed to build LLB")
}
res = &frontend.Result{Ref: ref}

The result that gets returned is the llb.Vertex we want to return, and we need it to also carry the full definition up to that vertex. It seems to only make sense to plumb the definition on the final result so I added req.Definition to the frontend.Result. Does this make sense? I'm not sure how this interplays with multi-result arrays.

  1. I think part of my confusion is I don't understand multi-result arrays too well. What is an example of a solve that returns multiple results? And how can it be utilized?

Todo

  • Forward definition of frontend-based solves
  • Tests
  • Handle validation and constraints correctly

@GordonTheTurtle
Copy link
Collaborator

Please sign your commits following these rules:
https://github.com/moby/moby/blob/master/CONTRIBUTING.md#sign-your-work
The easiest way to do this is to amend the last commit:

$ git clone -b "llbstate-from-result" git@github.com:hinshun/buildkit.git somewhere
$ cd somewhere
$ git rebase -i HEAD~842354600464
editor opens
change each 'pick' to 'edit'
save the file and quit
$ git commit --amend -s --no-edit
$ git rebase --continue # and repeat the amend for each commit
$ git push -f

Amending updates the existing PR. You DO NOT need to open a new one.

@hinshun
Copy link
Collaborator Author

hinshun commented Dec 18, 2019

Currently, the DefinitionOp implements Marshal by just remarshalling the Op the same way it's in the definition. However, I was bitten by this edge case in SourceOp that sets the op.Platform = nil if it's not a docker image:

if !platformSpecificSource(s.id) {
proto.Platform = nil
}

It's implemented this way because the helper function MarshalConstraint only overrides if the platform is not nil:

func MarshalConstraints(base, override *Constraints) (*pb.Op, *pb.OpMetadata) {
c := *base
c.WorkerConstraints = append([]string{}, c.WorkerConstraints...)
if p := override.Platform; p != nil {
c.Platform = p
}

I can either duplicate this edge case in DefinitionOp, or implement Marshal by writing code to unmarshal a pb.Op to each of their respective ExecOp, SourceOp, etc, and then calling their Marshal function. Currently I'm leaning on duplicating the edge case.

@hinshun hinshun force-pushed the llbstate-from-result branch 2 times, most recently from 358e813 to 6e8c830 Compare December 18, 2019 20:02
@hinshun hinshun marked this pull request as ready for review December 18, 2019 20:05
@hinshun hinshun force-pushed the llbstate-from-result branch from 6e8c830 to fa4ac7a Compare December 20, 2019 00:35
@tonistiigi
Copy link
Member

Currently, the DefinitionOp implements Marshal by just remarshalling the Op the same way it's in the definition. However, I was bitten by this edge case in SourceOp that sets the op.Platform = nil if it's not a docker image:

But don't call pop.Marshal at all inside definitionop.Marshal(). Instead save the raw bytes same way as you store ops in a map atm. and return the original bytes directly back instead of calling Marshal again.

@hinshun
Copy link
Collaborator Author

hinshun commented Dec 22, 2019

@tonistiigi You're right! That was dumb... 👍

@hinshun
Copy link
Collaborator Author

hinshun commented Jan 12, 2020

@tonistiigi Could you give this another review?

@tonistiigi tonistiigi self-requested a review January 13, 2020 02:40
@tonistiigi tonistiigi self-assigned this Jan 13, 2020
Copy link
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

{"local op", Local("name")},
{"git op", Git("remote", "ref")},
{"http op", HTTP("url")},
{"file op", Scratch().File(Mkdir("foo", 0600).Mkfile("foo/bar", 0600, []byte("data")).Copy(Scratch(), "src", "dst"))},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a test that sets a platform to the original state and then checks it after NewState(defOp)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tonistiigi Can you clarify what you mean by original state?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The state that is marshaled and then passed to NewDefinitionOp

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a specific call to state.Platform() to verify as the platform is stored in the state directly, not only in the proto where digest comparison is enough.


dockerfile := []byte(`
FROM scratch
COPY foo /
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add some renames to these copy operation to force a change in every step and avoid potential false positive. eg COPY foo foo2 llb.Copy(llb.NewState(ref), "foo2", "foo3")

require.Equal(t, dt, []byte("data"))
}

func testFrontendUseForwardedSolveResults(t *testing.T, sb integration.Sandbox) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this test should be inside frontend/dockerfile pkg?

@tonistiigi
Copy link
Member

@hinshun needs rebase

@hinshun hinshun force-pushed the llbstate-from-result branch from 28641dd to 584fb0f Compare January 15, 2020 17:30
@hinshun hinshun requested a review from tonistiigi January 15, 2020 21:36
Copy link
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a new cap for this, right? I assume NewState already errors on old daemons.

return nil, errors.Errorf("reference ids and definitions mismatch length")
}

dop, err := llb.NewDefinitionOp(ref.Defs[0])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do this [0] selectors need to be protected to avoid panic (at least on old/misbehaving daemons)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll protect them. 👍

Signed-off-by: Edgar Lee <edgarl@netflix.com>
Signed-off-by: Edgar Lee <edgarl@netflix.com>
…ded solve test to dockerfile package

Signed-off-by: Edgar Lee <edgarl@netflix.com>
Signed-off-by: Edgar Lee <edgarl@netflix.com>
@hinshun
Copy link
Collaborator Author

hinshun commented Jan 23, 2020

I think we need a new cap for this, right? I assume NewState already errors on old daemons.

@tonistiigi Thinking about how this will bubble up as an error. Instead of Reference implementing llb.Output perhaps it should have a function instead?

type Reference interface {
	ToOutput() (llb.Output, error)
	// ...
}

@tonistiigi
Copy link
Member

@hinshun Maybe ToState() then for user-friendliness. You can get output from a state if there is a need.

…ut to method ToOutput()

Signed-off-by: Edgar Lee <edgarl@netflix.com>
@hinshun
Copy link
Collaborator Author

hinshun commented Jan 23, 2020

@tonistiigi Do you think we can preserve context values in (llb.State).ctx through Marshal/Unmarshalling? I was thinking we can either leverage (pb.OpMetadata).Description or add a new proto field.

Example:

fmt.Printf("env: %s\n", st.Env()) // => env: ["key=value"]

def, _ := st.Marshal()
res, _ := c.Solve(ctx, client.SolveRequest{Definition: def.ToPB()})
ref, _ := res.SingleRef()
st2, _ := ref.ToState()

fmt.Printf("env: %s\n", st2.Env()) // => env: ["key=value"]

@tonistiigi
Copy link
Member

You can't get that information from the current proto. Not sure if it would make sense to add it. There is no magic inside llb.Image as well so probably better to leave it out. Maybe if there is a special option to save data on marshaling you could get it back from the ref by a key.

@hinshun
Copy link
Collaborator Author

hinshun commented Jan 23, 2020

Okay, I'll leave it out then.

@hinshun hinshun requested a review from tonistiigi January 23, 2020 21:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Previous solve results should be referable as locals in gw API

3 participants