
SecurityPolicy generated for NebariApp uses in-cluster Keycloak URL for browser-facing endpoints (authorizationEndpoint, endSessionEndpoint) #110

@dcmcand

Symptom

When nebari-operator reconciles a NebariApp with auth.provider: keycloak and auth.provisionClient: true, it generates an Envoy Gateway SecurityPolicy with all four OIDC provider endpoints populated from the in-cluster Keycloak service URL:

spec:
  oidc:
    clientID: <client>
    clientSecret: { ... }
    provider:
      authorizationEndpoint: http://keycloak-keycloakx-http.keycloak.svc.cluster.local:8080/realms/nebari/protocol/openid-connect/auth
      endSessionEndpoint:    http://keycloak-keycloakx-http.keycloak.svc.cluster.local:8080/realms/nebari/protocol/openid-connect/logout
      issuer:                http://keycloak-keycloakx-http.keycloak.svc.cluster.local:8080/realms/nebari
      tokenEndpoint:         http://keycloak-keycloakx-http.keycloak.svc.cluster.local:8080/realms/nebari/protocol/openid-connect/token
    redirectURL: https://<app-host>/oauth2/callback

authorizationEndpoint and endSessionEndpoint are the URLs the browser is redirected to. A browser outside the cluster cannot resolve keycloak-keycloakx-http.keycloak.svc.cluster.local, so the OAuth2 login flow fails at the first redirect. A curl request against the app shows the in-cluster URL in the Location header:

$ curl -sS -i https://<app-host>/
HTTP/1.1 302 Found
location: http://keycloak-keycloakx-http.keycloak.svc.cluster.local:8080/realms/nebari/protocol/openid-connect/auth?client_id=...

tokenEndpoint and issuer are used server-to-server (the Envoy proxy talks to Keycloak inside the cluster), so the in-cluster URL is fine for those.

Expected

The operator already has both URLs in its environment:

  • KEYCLOAK_URL=http://keycloak-keycloakx-http.keycloak.svc.cluster.local:8080 (in-cluster)
  • KEYCLOAK_EXTERNAL_URL=https://keycloak.<base> (public)

The SecurityPolicy generation should use:

  • authorizationEndpoint and endSessionEndpoint: KEYCLOAK_EXTERNAL_URL (browser hits these)
  • tokenEndpoint and issuer: either KEYCLOAK_URL (faster, in-cluster) or KEYCLOAK_EXTERNAL_URL (consistent). Both work; KEYCLOAK_URL is preferred for latency and to avoid making the public DNS / cert chain a hard dependency for token exchange.

Reproduction

Deploy a clean NIC on AWS and create a NebariApp with auth enabled. Reconcile completes and the SecurityPolicy shows all four endpoints on the in-cluster URL. curl -i https://<app-host>/ returns 302 with the in-cluster URL in Location.
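The check on the Location header could be automated in a regression test. The following is only a sketch: isClusterLocal is a hypothetical helper (not part of nebari-operator), and it assumes the default Kubernetes cluster domain cluster.local.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isClusterLocal reports whether rawURL points at a Kubernetes service DNS
// name that only resolves inside the cluster.
// Assumption: the cluster uses the default "cluster.local" domain.
func isClusterLocal(rawURL string) (bool, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false, err
	}
	// Hostname() strips the port, so ":8080" does not affect the suffix check.
	host := strings.ToLower(u.Hostname())
	return strings.HasSuffix(host, ".svc") ||
		strings.HasSuffix(host, ".svc.cluster.local"), nil
}

func main() {
	// Location value taken from the 302 response shown in the Symptom section
	// (query string shortened to a placeholder value).
	location := "http://keycloak-keycloakx-http.keycloak.svc.cluster.local:8080/realms/nebari/protocol/openid-connect/auth?client_id=example"

	local, err := isClusterLocal(location)
	if err != nil {
		fmt.Println("could not parse Location:", err)
		return
	}
	if local {
		fmt.Println("bug reproduced: browser is redirected to an in-cluster URL")
	} else {
		fmt.Println("ok: redirect target is not an in-cluster service name")
	}
}
```

A test could apply the same function to authorizationEndpoint and endSessionEndpoint in the generated SecurityPolicy and fail if either returns true.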

Where this surfaced

Found while validating the Nebari LLM Serving Pack fresh-install runbook (nebari-dev/nebari-llm-serving-pack#65) on a brand-new NIC + LLM-pack deploy. It blocks the user-facing key-manager UI flow end-to-end: users are supposed to mint API keys from the UI, and without working auth there is no UI access.

Operator version

quay.io/nebari/nebari-operator:0.1.0-alpha.18

Suggested fix

In the SecurityPolicy templating code (wherever the four provider.* endpoints get rendered), split the URL source by endpoint:

authorizationEndpoint := externalURL + "/realms/" + realm + "/protocol/openid-connect/auth"
endSessionEndpoint    := externalURL + "/realms/" + realm + "/protocol/openid-connect/logout"
issuer                := internalURL + "/realms/" + realm
tokenEndpoint         := internalURL + "/realms/" + realm + "/protocol/openid-connect/token"

If the codebase already has helpers that build these URLs from a single base URL, parameterize them by which env var they read. The existing KEYCLOAK_EXTERNAL_URL should be the source for the two browser-facing endpoints, and the existing KEYCLOAK_URL for the two server-to-server endpoints.
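As a slightly fuller sketch of the split above: the type oidcEndpoints and the function buildOIDCEndpoints are hypothetical names chosen for illustration, not taken from the operator's code, and the realm name "nebari" is copied from the SecurityPolicy in the Symptom section. Trimming trailing slashes is an added assumption to avoid "//realms" when an env var ends in "/".

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// oidcEndpoints holds the four provider URLs rendered into the SecurityPolicy.
// Hypothetical type, for illustration only.
type oidcEndpoints struct {
	AuthorizationEndpoint string
	EndSessionEndpoint    string
	Issuer                string
	TokenEndpoint         string
}

// buildOIDCEndpoints uses externalURL for the endpoints the browser visits
// and internalURL for the endpoints the proxy calls server-to-server.
func buildOIDCEndpoints(internalURL, externalURL, realm string) oidcEndpoints {
	// Assumption: tolerate a trailing slash on either base URL.
	internal := strings.TrimRight(internalURL, "/")
	external := strings.TrimRight(externalURL, "/")

	realmPath := "/realms/" + realm
	oidcPath := realmPath + "/protocol/openid-connect"

	return oidcEndpoints{
		AuthorizationEndpoint: external + oidcPath + "/auth",   // browser-facing
		EndSessionEndpoint:    external + oidcPath + "/logout", // browser-facing
		Issuer:                internal + realmPath,            // server-to-server
		TokenEndpoint:         internal + oidcPath + "/token",  // server-to-server
	}
}

func main() {
	// KEYCLOAK_URL and KEYCLOAK_EXTERNAL_URL are the env vars named in this issue.
	eps := buildOIDCEndpoints(
		os.Getenv("KEYCLOAK_URL"),
		os.Getenv("KEYCLOAK_EXTERNAL_URL"),
		"nebari",
	)
	fmt.Printf("%+v\n", eps)
}
```

Keeping both base URLs as explicit parameters makes the mapping easy to unit-test without touching the process environment.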
