Client Compatible Bedrock ARN handling#62720
Conversation
chrsmith
left a comment
There was a problem hiding this comment.
@RXminuS , after our conversation this morning we don't want to merge these changes, right?
e.g. if we have the raw AWS ARN be put in the ChatModel configuration field, it causes clients to choke on the Mal-formed model.
So we wanted to instead go with a model where the model was specified as something like anthropic.v1/${awsArn}. Right? So we should close this out?
5757308 to
cbc0b77
Compare
cbc0b77 to
d3947d3
Compare
chrsmith
left a comment
There was a problem hiding this comment.
I see you made some changes, posting the review comments before diffing them becomes a pain..
There was a problem hiding this comment.
minor nit: While I don't think go fmt or any style guide will give you an exact max column length, I would suggest that you add a new line after 80, 100, or 120 columns. (Whatever seems most consistent with the rest of the surrounding code.)
The problem with having a single, really long comment like this is that it makes the code harder to read. (e.g. you need to stop and then scroll horizontally.) Whereas splitting at XX columns, means you can read the code top-down.
There was a problem hiding this comment.
Also, I think the comment would benefit from some examples since it isn't clear what form <model_id> should take.
e.g.
// Bedrock Model IDs can be in one of two forms:
// - A static model ID, e.g. "anthropic.claude-v2".
// - A model ID and ARN for provisioned capacity, e.g.
// "anthropic.claude-v2/arn:aws:bedrock:us-west-2:012345678901:provisioned-model/xxxxxxxx"
//
// See the AWS docs for more information:
// https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html
// https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.htmlThere was a problem hiding this comment.
Can we add a quick note about what constitutes a valid BedrockModelID instance? Specifically, that in most cases, we expect the ProvisionedCapacity string to be empty.
As an optional suggestion, consider going a step further and making this explicit. (e.g. renaming the field to ProvisionedCapacityARN and changing its type to be *string.)
type BedrockModelIdParts struct {
// Model is the underlying LLM model Bedrock is serving, e.g. "anthropic.claude-3-haiku-20240307-v1:0"
Model string
// If the configuration is using provisioned capacity, this will
// contain the ARN of the model to use for making API calls.
// e.g. "anthropic.claude-v2/arn:aws:bedrock:us-west-2:012345678901:provisioned-model/xxxxxxxx"
ProvisionedCapacityARN *string
}There was a problem hiding this comment.
nit: Golang naming conventions would have you write this as ID (all caps).
| type BedrockModelIdParts struct { | |
| type BedrockModelIDParts struct { |
But I'll also say that something doesn't quite smell right, here. Since this data type isn't "parts" so much as just a way to describe an AWS Bedrock model. (And that just requires a ProvisionedCapacityARN in some situations.)
So what do you think about renaming this to just BedrockModel or BedrockModelRef or BedrockModelReference? "XXX ID Parts" just doesn't sound right, IMHO.
There was a problem hiding this comment.
As it turns out, strings.SplitN will return a slice with zero elements if given an empty string. So in some degenerate case this function will panic on line 116.
| parts := strings.SplitN(modelId, "/", 2) | |
| if modelId == "" { | |
| return BedrockModelIdParts{} | |
| } | |
| parts := strings.SplitN(modelId, "/", 2) |
https://go.dev/play/p/wlg3qDPUEmn
Also, re Golang naming conventions, it's more common to have the parameter name be modelID (capitalizing ID).
There was a problem hiding this comment.
I was unfamiliar with the concept of "casefold", which perhaps is more common in the Python world? For clarity, would it be more helpful to call this CanonicalizeBedrockModelID(string) string?
Or even change the signature to be:
func (bm BedrockModelId) CanonicalizedModel() string {That may make the code easier to read and follow. Also, since the function is exported from the package we may want to put a quick doc comment on it to explain what the function does and why it is needed. (e.g. when we would want to "casefold" a model ID.)
In case it comes up later, the Golang version of a case-insensitive comparison in case you missed it is called "EqualFold".
https://pkg.go.dev/strings#EqualFold
There was a problem hiding this comment.
| //check for bedrock ARNs | |
| // Check for bedrock ARNs, which if using provisioned capacity are in | |
| // an awkward format. |
There was a problem hiding this comment.
Minor nit: Since these aren't used anywhere else in the function, could we just move this code block to inside the if .Provider == AWSBedrock { block?
There was a problem hiding this comment.
Just for clarity, if you think this is unnecessary feel free to ignore.
| if strings.HasPrefix(modelId.value, "arn:aws:") { | |
| // When using provisioned capacity we expect an admin would just put the ARN | |
| // here directly, but we need both the model AND the ARN. Hence the check. | |
| if strings.HasPrefix(modelId.value, "arn:aws:") { |
There was a problem hiding this comment.
Can we add a positive test to, where the ChatModel isn't prefixed with are:aws?
Also, another test scenario that sets some other model like CompletionModel or FastChatModel? (i.e. just confirm that we look at more than just ChatModel.)
chrsmith
left a comment
There was a problem hiding this comment.
OK, finished the code review.
Overall, things look good. Functionality wise this seems to do what we need it to, so if your testing of how the backend supports these "provisioned capacity ARNs", then let's go ahead and get it checked-in.
But as far as code quality and/or Golang idioms, there are a few ways that we can improve this. Some recurring themes:
- Stylize
IDwith two capital letters. This is called out in Google's Golang style guide here - The "ID parts" data type seems a little off, e.g. conceptually it is a reference to an AWS Bedrock model. (Which may or may not contain an ARN for provisioned capacity.) So maybe we can make that field
*stringto make that super-clear?
Anyways, nothing structural. Just a lot of minor nitpicks. (And 🤞 we can make this problem go away entirely by having a more sophisticated way to refer to an LLM model in our config, in RFC 950.)
|
|
||
| if parsedModelId.ProvisionedCapacity != "" { | ||
| if stream { | ||
| apiURL.RawPath = fmt.Sprintf("/model/%s/invoke-with-response-stream", url.QueryEscape(parsedModelId.ProvisionedCapacity)) |
There was a problem hiding this comment.
Calling url.QueryEscape here is really surprising. IIRC, this is a quirk of the AWS API, and something we have to do. So perhaps that's subtle enough that we should call out with a quick comment? e.g.
// We need to Query escape the provisioned capacity ARN, since
// otherwise the AWS API <does thing that we don't like>.| @@ -853,9 +853,9 @@ func GetCompletionsConfig(siteConfig schema.SiteConfiguration) (c *conftypes.Com | |||
| } | |||
|
|
|||
| // Make sure models are always treated case-insensitive. | |||
There was a problem hiding this comment.
Can we update this comment to explain why we need to call CasefoldBedrockModelID and not just the "more obvious" strings.ToLower? e.g.
// Make sure models are always treated case-insensitive, however
// if the model provides a provisioned capacity ARN that needs
// to be kept as-is. (Since ARNs are case-sensitive.)| } | ||
|
|
||
| func anthropicDefaultMaxPromptTokens(model string) int { | ||
| //TODO: this doesn't nearly cover all the ways that token size can be specified. https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html |
There was a problem hiding this comment.
Ultra-pedantic nit pick around the syntax of a comment.
| //TODO: this doesn't nearly cover all the ways that token size can be specified. https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html | |
| // TODO: this doesn't nearly cover all the ways that token size can be specified. | |
| // See: https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html |
| conf.ContributeValidator(embeddingsConfigValidator) | ||
| } | ||
|
|
||
| const bedrockArnMessageTemplate = "completions.%s is invalid. Provisioned Capacity IDs must be formatted like \"model_id/provisioned_capacity_arn\".\nFor example \"anthropic.claude-instant-v1/%s\"" |
There was a problem hiding this comment.
In case you were unaware, in Golang you can use the %q format specifier to put double quotes around a string. So this arguably could be cleaned up by using that instead of \"...\".
fmt.Sprintf(
"completions.%s is invalid. Must be formatted like %q. For example %q.",
rawValue,
"model_id/provisioned_capacity_arn",
"anthropic.claude-instant-v1/" + rawValue)|
|
||
| const bedrockArnMessageTemplate = "completions.%s is invalid. Provisioned Capacity IDs must be formatted like \"model_id/provisioned_capacity_arn\".\nFor example \"anthropic.claude-instant-v1/%s\"" | ||
|
|
||
| type modelId struct { |
There was a problem hiding this comment.
Like throughout the PR, perhaps naming this modelID is the more idiomatic choice.
Also, because Golang is awesome like that, you can literally put a type definition within a function. So rather than introducing a package-level symbol here, you can define modelID inside the completionsConfigValidator function.
| // Bedrock Model IDs can be in one of two forms: | ||
| // - A static model ID, e.g. "anthropic.claude-v2". | ||
| // - A model ID and ARN for provisioned capacity, e.g. | ||
| // "anthropic.claude-v2/arn:aws:bedrock:us-west-2:012345678901:provisioned-model/xxxxxxxx" |
Check warning
Code scanning / Semgrep OSS
Semgrep Finding: security-semgrep-rules.semgrep-rules.secret.match-aws-arn
| Model string | ||
| // If the configuration is using provisioned capacity, this will | ||
| // contain the ARN of the model to use for making API calls. | ||
| // e.g. "anthropic.claude-v2/arn:aws:bedrock:us-west-2:012345678901:provisioned-model/xxxxxxxx" |
Check warning
Code scanning / Semgrep OSS
Semgrep Finding: security-semgrep-rules.semgrep-rules.secret.match-aws-arn
|
|
||
| // Check for bedrock Provisioned Capacity ARNs which should instead be | ||
| // formatted like: | ||
| // "anthropic.claude-v2/arn:aws:bedrock:us-west-2:012345678901:provisioned-model/xxxxxxxx" |
Check warning
Code scanning / Semgrep OSS
Semgrep Finding: security-semgrep-rules.semgrep-rules.secret.match-aws-arn
|
The backport to To backport this PR manually, you can either: Via the sg toolUse the sg backport -r 5.4.0 -p 62720Via your terminalTo backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-5.4.0 5.4.0
# Navigate to the new working tree
cd .worktrees/backport-5.4.0
# Create a new branch
git switch --create backport-62720-to-5.4.0
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 6a7666c42cf349a6b702b8c72c9cc2c4e8421ad7
# Push it to GitHub
git push --set-upstream origin backport-62720-to-5.4.0
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-5.4.0If you encouter conflict, first resolve the conflict and stage all files, then run the commands below: git cherry-pick --continue
# Push it to GitHub
git push --set-upstream origin backport-62720-to-5.4.0
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-5.4.0
Once the pull request has been created, please ensure the following:
|
* Improve Bedrock ARN handling * Fix up PR comments (cherry picked from commit 6a7666c)
Improve handling of Bedrock ARNs so that they no longer break client parsing of ModelIDs. This used to result in the CompletionProvider not initializing when configuring a ARN model ID.
This is now resolved by requiring a
<underlying_model_id>/<provisioned_capacity_arn>format which is correctly parsed by the client. This also has the benefit of providing some information about the underlying model behind the provisioned capacityTest plan