Skip to content

Add an additional_info_url fields to ExecuteResponse.#36

Merged
buchgr merged 1 commit intobazelbuild:masterfrom
EdSchouten:additional-info-urls
Nov 22, 2018
Merged

Add an additional_info_url fields to ExecuteResponse.#36
buchgr merged 1 commit intobazelbuild:masterfrom
EdSchouten:additional-info-urls

Conversation

@EdSchouten
Copy link
Copy Markdown
Collaborator

As mentioned in bazelbuild/bazel#6591, Bazel
Buildbarn now ships with a web service that can be used to explore data
stored in the CAS and AC.

By adding an extra field to ExecuteResponse, we can let Bazel
automatically report URLs to the user pointing to the correct page.
These URLs could, for example, be printed for build steps that fail.

@googlebot googlebot added the cla: yes Pull requests whose authors are covered by a CLA with Google. label Nov 5, 2018

// Similar to `server_logs`, an optional list of URLs the user may visit to
// obtain more information about the execution of the action.
repeated string additional_info_url = 5;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why repeated? Do you have a use case for this already?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can only think this is due to logs from the worker side and logs from the execution side?

Copy link
Copy Markdown
Collaborator Author

@EdSchouten EdSchouten Nov 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a very good question. I have to say I didn't give this a lot of thought. "Why not?" must have been my reasoning.

Right now it's just the Bazel Buildbarn Browser to which we link. That said, there is a fair chance the scheduler processes will at some point gain a status page with info on workers, etc. Maybe there is a use for emitting links to that as well?

That's all speculating. If there's a preference for a non-repeated field (or the other extreme: a string keyed map, just like the server logs), I can change this. :-)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more generic version of the additional_info_url field would be a

string message = 5;

with the expectation being that a server can put an arbitrary message that a client is expected to display to the user. I do prefer the additional_info_url field because it's simple, no surprises and allows a client to have more control over how it presents the information to the user.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now it's just the Bazel Buildbarn Browser to which we link. That said, there is a fair chance the scheduler processes will at some point gain a status page with info on workers, etc. Maybe there is a use for emitting links to that as well?

Wouldn't this require either a key or a description for each URL to make this usable? A list of URLs without any metadata doesn't sound user-friendly.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, exactly. Let's switch this to a map then.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now it's just the Bazel Buildbarn Browser to which we link. That said, there is a fair chance the scheduler processes will at some point gain a status page with info on workers, etc. Maybe there is a use for emitting links to that as well?

In that case couldn't you still only post a single link that takes the user to a web interface and link to other things from there? One link seems simple and powerful enough to fulfill all of these use cases.

@buchgr buchgr requested review from aiuto and ola-rozenfeld November 6, 2018 08:50
@finnball
Copy link
Copy Markdown

finnball commented Nov 6, 2018

Having read the original work this is based on, this is a nice proposal and I really like the website style that produces easily readable information for users.

Though I'm unsure if support for URLs with additional information really fits into the Remote Execution API here.

Perhaps we could expand the current ExecutionMetadata here. Potentially, as you pointed out, including come kind of map. I already find just including the stdout_stream_name to be limiting in that it doesn't provide the ByteStream URL.

The only problem with my suggestion is through duplication of Actions. The API specifies that this be: // The details of the execution that originally produced this result. That's the ExecutedActionMetadata. I knew I was going to make this mistake at some point!

I still feel like this is more of a metadata result and very much optional which is what makes me feel uncomfortable having it written in the current form it is.

@buchgr buchgr removed the request for review from aiuto November 6, 2018 09:26
@buchgr
Copy link
Copy Markdown
Contributor

buchgr commented Nov 6, 2018

I'm unsure if support for URLs with additional information really fits into the Remote Execution API. It may be an idea to have this as some kind of Any field whereby you can pack in your own implementation specific protos.

Do you have specific concerns? I don't like Any in that it hurts cross compatibility between clients and services - it's poking a hole into a well specified protocol.

@finnball
Copy link
Copy Markdown

finnball commented Nov 6, 2018

I'm unsure if support for URLs with additional information really fits into the Remote Execution API. It may be an idea to have this as some kind of Any field whereby you can pack in your own implementation specific protos.

Do you have specific concerns? I don't like Any in that it hurts cross compatibility between clients and services - it's poking a hole into a well specified protocol.

Apologies, I amended my answer as you posted after realising it wasn't a sensible suggestion. For me, this is more of a metadata result than something essential to the ExecutionResponse.

@buchgr
Copy link
Copy Markdown
Contributor

buchgr commented Nov 6, 2018

My rational for including it in the ExecuteRequest would be that this URL fulfills the same purpose as server_logs, except that a URL can point to a rich (web) interface that can do a better job displaying information and potentially offer additional functionality. I think both server_logs and additional_info_url do make sense and can be useful. I think the human_readable field in LogFile will turn out to have been a bad idea, as it seems not realistic for a client to display arbitrary, server defined log files in it's UI.

I would argue against ExecuteOperationMetadata as it's not clear yet whether this whole message will turn out to be useful. It seems a bit complicated. We don't use it in Bazel and have no intention of doing so.

@finnball
Copy link
Copy Markdown

finnball commented Nov 6, 2018

My rational for including it in the ExecuteRequest would be that this URL fulfills the same purpose as server_logs, except that a URL can point to a rich (web) interface that can do a better job displaying information and potentially offer additional functionality. I think both server_logs and additional_info_url do make sense and can be useful. I think the human_readable field in LogFile will turn out to have been a bad idea, as it seems not realistic for a client to display arbitrary, server defined log files in it's UI.

s/ExecuteRequest/ExecuteResponse :)

I have no problem with including this extra information and I rather like the idea. I just question if we should be shoving this field into the ExecuteResponse as it currently stands.

So if this is a type of log as you say, would it be better to refactor the message LogFile than adding another field whilst keeping a field that as you suggest, could be improved?

I would argue against ExecuteOperationMetadata as it's not clear yet whether this whole message will turn out to be useful. It seems a bit complicated. We don't use it in Bazel and have no intention of doing so.

I find this field useful for builds that take a long time. I like how it allows the user to optionally stream logs back to the client and lets them know the status of the build. There are certain limitations about it which annoy me and I do question if we should be using the long-running-api or a modified REAPI version instead though that's for a separate discussion.

I've also made a mistake in my above comment, when I spoke about // The details of the execution that originally produced this result. . This is for the ExecutedActionMetadata and not ExecuteOperationMetadata. So we could actually move this field into the metadata if we wanted.

I would therefore argue that this field somehow be included as an improved logging message or in a metadata field.

@EdSchouten
Copy link
Copy Markdown
Collaborator Author

EdSchouten commented Nov 6, 2018

Moving it into the ExecutedActionMetadata would be a bit odd in Buildbarn's use case specifically. ExecutedActionMetadata is stored within ActionResult. The page that I've made for Buildbarn is used to inspect ActionResult objects specifically. Upon successful builds, ActionResults are normally stored in the AC. Because we care about the failures in particular, we store them in a special message in the CAS to not interfere:

https://github.com/EdSchouten/bazel-buildbarn/blob/master/pkg/proto/failure/failure.proto

The URL that we return to the user is not much more than a digest of this message stored in the CAS. This is where it becomes problematic: by basing the URL on the digest of this message and storing that URL within the message, we introduce a cycle. This cycle could be avoided by only adding the URL to the ActionResult to the message returned to Bazel and omitting it from the ActionResult that is stored, but that feels sloppy.

Another thing that is worrisome about storing this underneath ActionResult is that it now also ends up in the AC. This means that an administrator of a build cluster has to purge the entire AC in case the URLs change (e.g., switching hostnames, upgrading to a newer version of Buildbarn Browser that has a different URL scheme).

@finnball
Copy link
Copy Markdown

finnball commented Nov 6, 2018

Moving it into the ExecutedActionMetadata would be a bit odd in Buildbarn's use case specifically. ExecutedActionMetadata is stored within ActionResult. The page that I've made for Buildbarn is used to inspect ActionResult objects specifically. Upon success, ActionResults are normally stored in the AC. Because we care about the failures in particular, we store them in a special message in the CAS:

https://github.com/EdSchouten/bazel-buildbarn/blob/master/pkg/proto/failure/failure.proto

The URL that we return to the user is not much more than a digest of this message stored in the CAS. This is where it becomes problematic: by basing the URL on the digest of this message and storing that URL within the message, we introduce a cycle. This cycle could be avoided by only adding the URL to the ActionResult to the message returned to Bazel and omitting it from the ActionResult that is stored, but that feels sloppy.

Another thing that is worrisome about storing this underneath ActionResult is that it now also ends up in the AC. This means that an administrator of a build cluster has to purge the entire AC in case the URLs change (e.g., switching hostnames, upgrading to a newer version of Buildbarn that has a different URL scheme).

Agreed, I don't think this should be in the ExecutedActionMetadata.

To be clear, I was suggesting we store these keys in ExecuteOperationMetadata, not ExecutedActionMetadata. Or even some other metadata field yet to be defined.

@EdSchouten
Copy link
Copy Markdown
Collaborator Author

Ah, gotcha. So it's either ExecuteOperationMetadata or ExecuteResponse. I think deciding between these two should be done by looking at the behaviour of Bazel.

For a single ExecuteRequest, Bazel fires off one or more operations. Operations may fail or disappear for many odd reasons (e.g., workers dying). This causes Bazel to create brand new operations for the same ExecuteRequest until it ends up receiving an ExecuteResponse. For each of these operations, ExecuteOperationMetadata is permitted to contain data that is different (e.g., different std{out,err}_stream_names).

This means that if these URLs are stored in the ExecuteOperationMetadata, Bazel might end up with a pile of different URLs to display for a single build step. Do we want Bazel to potentially print multiple similar URLs for every build step? If so, the URLs should go in ExecuteOperationMetadata. If not, they should likely go in ExecuteResponse.

For Buildbarn Browser, the latter is sufficient.

@finnball
Copy link
Copy Markdown

finnball commented Nov 6, 2018

Ah, gotcha. So it's either ExecuteOperationMetadata or ExecuteResponse. I think deciding between these two should be done by looking at the behaviour of Bazel.

Bazel isn't the only client here. A better approach might be to look at what the API specifies. Currently when sending an ExecuteRequest, you receive a longrunning.Operation as shown here with a metadata field described here. If this needs changing then we should start a discussion.

For a single ExecuteRequest, Bazel fires off one or more operations. Operations may fail or disappear for many odd reasons (e.g., workers dying). This causes Bazel to create brand new operations for the same ExecuteRequest until it ends up receiving an ExecuteResponse. For each of these operations, ExecuteOperationMetadata is permitted to contain data that is different (e.g., different std{out,err}_stream_names).

In this rare scenario, if a worker dies, could the server not retry the job internally, keeping the same operation_id as before? Or could the client just retry the job and use the new operation id it is given?

This means that if these URLs are stored in the ExecuteOperationMetadata, Bazel might end up with a pile of different URLs to display for a single build step. Do we want Bazel to potentially print multiple similar URLs for every build step? If so, the URLs should go in ExecuteOperationMetadata. If not, they should likely go in ExecuteResponse.

For Buildbarn Browser, the latter is sufficient.

For jobs that take a long time, I wouldn't mind clicking on a URL to see what was happening. This URL could be retrieved from the metadata streamed from the first request you make. If you want to retrieve it from the last, you could still retrieve it from the metadata you receive from the final response you get since the ExecuteResponse is supposed to be in the response field of the long-running-api.

It is a little subjective, though I feel that this data is metadata and should be made available from the first call and not just the last.

@bergsieker
Copy link
Copy Markdown
Contributor

Coming late to this very active party. :) I think that Finn has things roughly correct: the ExecuteResponse is intended to capture "end state" metadata for completed Actions, while the ExecuteOperationMetadata is intended to capture "status update" metadata for ongoing Actions. I think there are valid use cases for each type of metadata (and some data belongs in multiple places, e.g., stdout/stderr is provided as a stream handle in ExecuteOperationMetadata, but also as a digest in ActionResult). So, perhaps we should consider designating a place for this type of metadata in both ExecuteOperationMetadata and ExecuteResponse.

Note that Bazel doesn't currently use ExecuteOperationMetadata, and that for Bazel, which is generally used to dispatch 100s or 1000s or actions, printing metadata for each ongoing action probably isn't the right thing to do. Other clients may have different usage patterns, though, so printing per-action metadata for ongoing actions may make more sense.

In my opinion, the bigger question is how to capture and represent this data. I'd suggest that server_logs may be the right place after all, but that LogFile may need to be extended to support richer processing. For example, maybe it should have either a digest or embedded text. I think it should also specify some degree of handling preferences--something like "always print," "print on failure," or "never print"--as a hint to the client.

@buchgr
Copy link
Copy Markdown
Contributor

buchgr commented Nov 12, 2018

Note that Bazel doesn't currently use ExecuteOperationMetadata, and that for Bazel, which is generally used to dispatch 100s or 1000s or actions, printing metadata for each ongoing action probably isn't the right thing to do. Other clients may have different usage patterns, though, so printing per-action metadata for ongoing actions may make more sense.

YAGNI. If there is no current use case I'd argue we should revisit this once there is.

I'd suggest that server_logs may be the right place after all, but that LogFile may need to be extended to support richer processing. For example, maybe it should have either a digest or embedded text.

Works for me.

I think it should also specify some degree of handling preferences--something like "always print," "print on failure," or "never print"--as a hint to the client.

YAGNI.

@finnball
Copy link
Copy Markdown

finnball commented Nov 12, 2018

Thankyou for you responses :)

I'd suggest that server_logs may be the right place after all, but that LogFile may need to be extended to support richer processing. For example, maybe it should have either a digest or embedded text. I think it should also specify some degree of handling preferences--something like "always print," "print on failure," or "never print"--as a hint to the client.

Adding or amending to the LogFile map feels to me like a bandaid over a field which isn't that useful in its current form anyway. We're now going to be in a situation where we have logs contained in the result and logs in the metadata.

for this type of metadata in both ExecuteOperationMetadata and ExecuteResponse.

I still feel that we are better off having a single metadata field instead of repeating ourselves. Bazel surely still uses the long-running-api to some extent in that it needs to look into the operation.result field for the ExecuteResponse as stated here.

If you take a look at what is returned from an ExecuteRequest, this format to me mostly seems sensible. It includes the result, the done flag and importantly a metadata field. Surely Bazel could just check that done flag, if true, then scrape any optional metadata from the metadata field? This allows us for example, to pass any meaningful links to the user from the first request, allowing the client to dictate if it makes this visible from the start or not, instead of allowing specific implementations to dictate the API.

@buchgr
Copy link
Copy Markdown
Contributor

buchgr commented Nov 13, 2018

@finnball are you suggesting of moving the server_logs field to the ExecuteOperationMetadata? I agree that this information should be contained in either one or the other but not both.

@finnball
Copy link
Copy Markdown

Unless there's a good reason not to, I think it's better to move the mapping discussed for this MR to the ExecuteOpertationMetadata as it gives us more flexibility for when the client chooses to display this data to the user.

I'm also suggesting in the future and in a separate MR, we could look at removing the LogField and replace it with something more generic in the ExecutionoperationMetadata that would also encompass the stdout_stream_name, as it seems a little odd that we have two potential places for logging :)

@ola-rozenfeld
Copy link
Copy Markdown
Contributor

Sorry for being even later to the party!

Agree with Steven on the difference between the metadata and the response -- ExecuteOperationMetadata describes an in-progress operation, with fields such as stream handles for the outputs (and maybe logs, if we need them), while ExecuteResponse describes the completed action, with all the outputs already digested, and is stored persistently.

I see a use-case for both here -- if the URL can be used to view the progress of an already executing action, or to ssh to the worker, or things like that -- these items belong in ExecuteOperationMetadata. But if the URL is used for debugging a failed action, then it belongs in the ExecuteResponse. Re: loop -- yes, the response digest cannot be part of the key in this case, it would have to be action-id+timestamp or some such.

Btw, disagree with Jakob on ExecuteOperationMetadata -- it is the only way to get streamed logs, and we will definitely need to use it in Bazel if we want --test_output=streamed to work as intended with RE.

As to the field names: I am ambivalent about assuming that a server will always communicate in urls, I prefer the message field which clients can display to the user, it's just more generic. But I'm not opposed to a bunch of urls either.

@buchgr
Copy link
Copy Markdown
Contributor

buchgr commented Nov 15, 2018

In order to make progress on solving this particular use case of displaying a url in case of a failed action. Would anyone strongly disagree with adding a simple string message field to ExecuteResponse whose intend is to be displayed to and read by a human?

For the other points raised here, let's talk about them in separate issues? I am happy to create the issues and start the discussion.

@EdSchouten
Copy link
Copy Markdown
Collaborator Author

Sounds good to me!

This field may be used by a remote builder to inject additional lines of
text into the output of the build client. Buildbarn may use this to
print links to Buildbarn Browser, so that users can inspect build
failures in more detail.
@bergsieker
Copy link
Copy Markdown
Contributor

SGTM

1 similar comment
@ola-rozenfeld
Copy link
Copy Markdown
Contributor

SGTM

@buchgr
Copy link
Copy Markdown
Contributor

buchgr commented Nov 16, 2018

paging @finnball :-)

@finnball
Copy link
Copy Markdown

Yup, fine by me. Thankyou.

@buchgr buchgr merged commit ed48498 into bazelbuild:master Nov 22, 2018
henryaj pushed a commit to henryaj/buildgrid that referenced this pull request Nov 20, 2019
Noteworthy changes include:
* From remote_execution.proto:
  - New message field in ExecuteResponse [1].
  - Output file and dir. parent folders clarifications [2].
  - Better doc. for do_not_cache/skip_cache_lookup [3].
  - Documention updates.
* From semver.proto:
  - Documentation updates.

[1] bazelbuild/remote-apis#36
[2] bazelbuild/remote-apis#42
[3] bazelbuild/remote-apis#55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes Pull requests whose authors are covered by a CLA with Google.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants