
Fix WaveClient sending Bearer token to public S3 URLs #6672

Merged
pditommaso merged 1 commit into master from fix/wave-bearer-token-s3
Dec 19, 2025

Conversation

@pditommaso
Member

Summary

  • Fix WaveClient sending Bearer token to external URLs (e.g., public S3 buckets) when fetching container configs
  • Add dedicated plainHttpClient() without authentication for external resource requests
  • S3 and similar services reject requests with unsupported Authorization headers

Changes

  • Added plainHttpClient() factory method with @Memoized annotation for lazy initialization
  • Updated fetchContainerConfig() to use the plain client instead of the authenticated client
  • Added Javadoc to explain the rationale for each HTTP client
  • Added unit test that verifies container config fetch works without Bearer token
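As a rough illustration of the lazy initialization mentioned above: Groovy's @Memoized caches a method's return value after the first call. The Java sketch below approximates that behavior with a caching Supplier. The names LazyClients, memoize and PLAIN_CLIENT are hypothetical and are not taken from the Nextflow code base.

```java
import java.net.http.HttpClient;
import java.util.function.Supplier;

public class LazyClients {

    /**
     * Wraps a supplier so the delegate runs once and its result is reused.
     * Assumes the delegate never returns null (a null would be recomputed).
     */
    static <T> Supplier<T> memoize(Supplier<T> delegate) {
        return new Supplier<T>() {
            private volatile T value;

            @Override
            public T get() {
                T result = value;
                if (result == null) {
                    synchronized (this) {
                        result = value;
                        if (result == null) {
                            value = result = delegate.get();
                        }
                    }
                }
                return result;
            }
        };
    }

    // Hypothetical stand-in for plainHttpClient(): built on first use, then shared.
    static final Supplier<HttpClient> PLAIN_CLIENT = memoize(HttpClient::newHttpClient);

    public static void main(String[] args) {
        // Both calls return the same cached instance.
        System.out.println(PLAIN_CLIENT.get() == PLAIN_CLIENT.get());
    }
}
```

The point of the cache is that the unauthenticated client is created only if a container config is actually fetched, and is then shared across requests.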

Fixes #6671

Test plan

  • Unit test added that simulates S3 behavior (rejects Bearer tokens)
  • All existing nf-wave tests pass

🤖 Generated with Claude Code

When fetching container configs from external URLs (e.g., public S3 buckets),
the WaveClient was sending the Tower/Platform Bearer token with the request.
AWS S3 does not support Bearer authentication and returns a 400 error:
"Unsupported Authorization Type".

This fix adds a dedicated HTTP client (plainHttpClient) without Bearer token
authentication for fetching external container configs, while the main client
continues to use authentication for Wave/Tower API calls.

Fixes #6671

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
@netlify

netlify bot commented Dec 18, 2025

Deploy Preview for nextflow-docs-staging canceled.

Name Link
🔨 Latest commit 1103109
🔍 Latest deploy log https://app.netlify.com/projects/nextflow-docs-staging/deploys/6943c388a161f40008a8912b

@pditommaso pditommaso requested a review from jorgee December 18, 2025 09:04
@pditommaso
Member Author

@jorgee Good question. After reviewing both codebases, here's what I found:

The Bearer token IS used by Wave backend, but for a different purpose than fetching the container config URL.

How the token flow works:

  1. Nextflow WaveClient sends the Tower access token to Wave API endpoints (like /v1alpha2/container)

  2. Wave backend validates it against Tower/Platform - it forwards the token to the Tower /user-info endpoint to verify it's valid and retrieve the user identity (UserServiceImpl.getUserByAccessToken)

  3. What the token enables: Once validated, the PlatformId (user info, workspace, token) is used for:

    • Fetching registry credentials from Tower
    • Accessing workflow launch info
    • Rate limiting and metrics tied to user identity

Why plainHttpClient() is correct here:

The containerConfig URL returned by Wave points to an external resource (like a public S3 bucket), not to the Wave API itself. This URL doesn't need/accept the Tower Bearer token - S3 uses different auth mechanisms (signed URLs, IAM, etc.).

The Wave API request that returns this URL already used the authenticated client. The subsequent fetch of the config content from the external URL should be unauthenticated.

So to answer your question: No, the token is not required for fetching the container config content from external URLs. The token is only needed for Wave API calls, which still use httpClient() with authentication.
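The distinction described above can be sketched with the plain java.net.http API: the request to the Wave API carries the Bearer token, while the follow-up request to the external config URL carries no Authorization header. This is only an illustration under assumptions. The class and method names and the example.com URLs are hypothetical, and the token is attached per request here for clarity, which is not necessarily how WaveClient or HxClient attach it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class RequestSketch {

    // Request to a Wave API endpoint: carries the Tower/Platform Bearer token.
    static HttpRequest waveApiRequest(URI endpoint, String accessToken, String jsonBody) {
        return HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + accessToken)
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    // Request to the external container config URL: no Authorization header at all.
    static HttpRequest containerConfigRequest(URI configUrl) {
        return HttpRequest.newBuilder(configUrl).GET().build();
    }

    // True when the request carries an Authorization header.
    static boolean hasAuthorization(HttpRequest request) {
        return request.headers().firstValue("Authorization").isPresent();
    }

    public static void main(String[] args) {
        // Hypothetical URLs, used only to build the requests (nothing is sent).
        HttpRequest api = waveApiRequest(
                URI.create("https://wave.example.com/v1alpha2/container"), "my-token", "{}");
        HttpRequest config = containerConfigRequest(
                URI.create("https://bucket.example.com/path/fusion-amd64.json"));

        System.out.println("Wave API request authenticated: " + hasAuthorization(api));
        System.out.println("Config request authenticated: " + hasAuthorization(config));
    }
}
```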

protected HxClient plainHttpClient() {
    return HxClient.newBuilder()
        .httpClient(newHttpClient0())
        .retryConfig(config.retryOpts())
Contributor

Suggested change
.retryConfig(config.retryOpts())
.retryConfig(config.retryOpts())
.followRedirects(HttpClient.Redirect.NORMAL)

Member Author

Quite sure this is the default

final resp = plainHttpClient().sendAsString(req)
final code = resp.statusCode()
final body = resp.body()
if( code>=200 && code<400 ) {
Contributor

Suggested change
if( code>=200 && code<400 ) {
if( code>=200 && code<300 ) {

Contributor

@jorgee jorgee left a comment

I tested the provided URL and it answered with a 301. That was treated as a valid status code, and the fetch then failed because it expected JSON but S3 responded with XML:

Caused by:
  Expected BEGIN_OBJECT but was STRING at line 1 column 1 path $
  See https://github.com/google/gson/blob/main/Troubleshooting.md#unexpected-json-structure

As this is not a Wave client request, I suggest the following changes:
1 - configure the plainHttpClient with followRedirects(NORMAL)
2 - treat redirect status codes as errors to avoid the JSON parsing problem.

Not sure about (1), but we should at least do (2)
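A minimal sketch of suggestion (2), assuming the check lives in a small helper: only 2xx codes count as success, so a 301 with an XML body fails with a clear message instead of reaching the JSON parser. The class and method names are hypothetical, and this is not the code that was merged.

```java
public class StatusCheck {

    // Only 2xx responses are treated as success; 3xx is no longer accepted.
    static boolean isSuccess(int code) {
        return code >= 200 && code < 300;
    }

    // Returns the body for a 2xx response, otherwise fails before any JSON parsing.
    static String requireSuccess(int code, String body) {
        if (!isSuccess(code)) {
            throw new IllegalStateException(
                    "Unexpected response fetching container config: HTTP " + code);
        }
        return body;
    }

    public static void main(String[] args) {
        System.out.println(requireSuccess(200, "{\"entrypoint\": []}"));
        try {
            // A redirect that was not followed is reported as an error.
            requireSuccess(301, "<Error><Code>PermanentRedirect</Code></Error>");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```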

@pditommaso
Member Author

The redirect is handled by the http client by default. Just tried my branch and it's ok

@jorgee
Contributor

jorgee commented Dec 18, 2025

Maybe I am confused, but isn't HxClient a wrapper of HttpClient? In that library, the default for followRedirects is NEVER: https://docs.oracle.com/en/java/javase/11/docs/api/java.net.http/java/net/http/HttpClient.html#followRedirects()

But whether or not NORMAL is the default, if the redirect has been followed we shouldn't receive a 3XX. So I think 3XX should be considered an error. This is why I was getting the JSON parsing error when trying with fusion.containerConfigUrl = 'https://s3.eu-west-2.amazonaws.com/bucket/path/fusion-amd64.json'
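The JDK default mentioned above can be checked directly with the java.net.http API. The sketch below only shows what a bare HttpClient reports. It says nothing about what HxClient or newHttpClient0() configure, which is not shown in this thread. The class and method names are hypothetical.

```java
import java.net.http.HttpClient;

public class RedirectDefault {

    // Redirect policy of a client built without calling followRedirects(...).
    static HttpClient.Redirect defaultPolicy() {
        return HttpClient.newHttpClient().followRedirects();
    }

    // Redirect policy of a client that opts in to following redirects.
    static HttpClient.Redirect explicitPolicy() {
        return HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build()
                .followRedirects();
    }

    public static void main(String[] args) {
        System.out.println(defaultPolicy());   // prints NEVER
        System.out.println(explicitPolicy());  // prints NORMAL
    }
}
```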

@pditommaso
Member Author

So why do the (integration) tests pass?

@jorgee
Contributor

jorgee commented Dec 18, 2025

OK, I see what is happening. The URL in the issue is fake. As it is a regional S3 endpoint, AWS returns a 301 to redirect to the global endpoint, which is what produced the error in my case. With a correct URL, it works.

@pditommaso
Member Author

I assume this is ok in the correct form then

@jorgee
Contributor

jorgee commented Dec 19, 2025

Yes

@pditommaso pditommaso merged commit ffaef0b into master Dec 19, 2025
36 checks passed
@pditommaso pditommaso deleted the fix/wave-bearer-token-s3 branch December 19, 2025 11:37
