Symptom
When nebari-operator reconciles a NebariApp with auth.provider: keycloak + auth.provisionClient: true, it generates an Envoy Gateway SecurityPolicy with all four OIDC provider endpoints populated from the in-cluster Keycloak service URL:
spec:
  oidc:
    clientID: <client>
    clientSecret: { ... }
    provider:
      authorizationEndpoint: http://keycloak-keycloakx-http.keycloak.svc.cluster.local:8080/realms/nebari/protocol/openid-connect/auth
      endSessionEndpoint: http://keycloak-keycloakx-http.keycloak.svc.cluster.local:8080/realms/nebari/protocol/openid-connect/logout
      issuer: http://keycloak-keycloakx-http.keycloak.svc.cluster.local:8080/realms/nebari
      tokenEndpoint: http://keycloak-keycloakx-http.keycloak.svc.cluster.local:8080/realms/nebari/protocol/openid-connect/token
    redirectURL: https://<app-host>/oauth2/callback
authorizationEndpoint and endSessionEndpoint are the URLs the browser is redirected to. The browser cannot resolve keycloak-keycloakx-http.keycloak.svc.cluster.local, so the entire OAuth2 login flow fails. Curl test:
$ curl -sS -i https://<app-host>/
HTTP/1.1 302 Found
location: http://keycloak-keycloakx-http.keycloak.svc.cluster.local:8080/realms/nebari/protocol/openid-connect/auth?client_id=...
tokenEndpoint and issuer are server-to-server (Envoy proxy talking to Keycloak in-cluster); the in-cluster URL is fine for those.
Expected
Operator already has both URLs in env:
KEYCLOAK_URL=http://keycloak-keycloakx-http.keycloak.svc.cluster.local:8080 (in-cluster)
KEYCLOAK_EXTERNAL_URL=https://keycloak.<base> (public)
The SecurityPolicy generation should use:
authorizationEndpoint and endSessionEndpoint: KEYCLOAK_EXTERNAL_URL (browser hits these)
tokenEndpoint and issuer: either KEYCLOAK_URL (faster, in-cluster) or KEYCLOAK_EXTERNAL_URL (consistent). Both work; KEYCLOAK_URL is preferred for latency and to avoid making the public DNS / cert chain a hard dependency for token exchange.
Reproduction
Clean NIC AWS deploy + a NebariApp with auth enabled. Reconcile completes, SecurityPolicy shows all four endpoints on the in-cluster URL. curl -i https://<app-host>/ returns 302 with the in-cluster URL in Location.
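The Location check from the reproduction could be automated with something like the sketch below. It is not taken from the operator codebase: `isClusterInternal` is a hypothetical helper, and it assumes the cluster uses the default `cluster.local` DNS domain. It only inspects a URL string, so the 302 response still has to be fetched separately (for example with the curl command above).

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isClusterInternal reports whether a redirect target points at a Kubernetes
// in-cluster service DNS name, which a browser outside the cluster cannot
// resolve. Hypothetical helper; assumes the default "cluster.local" domain.
func isClusterInternal(location string) (bool, error) {
	u, err := url.Parse(location)
	if err != nil {
		return false, err
	}
	// Hostname() drops the port; also tolerate a trailing dot (FQDN form).
	host := strings.TrimSuffix(u.Hostname(), ".")
	internal := strings.HasSuffix(host, ".svc.cluster.local") ||
		strings.HasSuffix(host, ".svc")
	return internal, nil
}

func main() {
	// Authorization URL as it appears in the failing 302 (query string omitted).
	loc := "http://keycloak-keycloakx-http.keycloak.svc.cluster.local:8080/realms/nebari/protocol/openid-connect/auth"
	internal, err := isClusterInternal(loc)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("redirect points at in-cluster host:", internal)
}
```

A check like this could also serve as a regression test against the rendered SecurityPolicy, asserting that authorizationEndpoint and endSessionEndpoint never resolve to an in-cluster host.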
Where this surfaced
Validating the Nebari LLM Serving Pack fresh-install runbook (nebari-dev/nebari-llm-serving-pack#65) on a brand-new NIC + LLM-pack deploy. Blocks the user-facing key-manager UI flow end-to-end (the user is supposed to mint API keys from the UI; without working auth, no UI access).
Operator version
quay.io/nebari/nebari-operator:0.1.0-alpha.18
Suggested fix
In the SecurityPolicy templating code (wherever the four provider.* endpoints get rendered), split the URL source by endpoint:
authorizationEndpoint := externalURL + "/realms/" + realm + "/protocol/openid-connect/auth"
endSessionEndpoint := externalURL + "/realms/" + realm + "/protocol/openid-connect/logout"
issuer := internalURL + "/realms/" + realm
tokenEndpoint := internalURL + "/realms/" + realm + "/protocol/openid-connect/token"
If the codebase already has helpers for one URL, parameterize them by which env var they read. The existing KEYCLOAK_EXTERNAL_URL env var should be the source for the two browser-facing endpoints, and the existing KEYCLOAK_URL for the two server-to-server endpoints.
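As a rough illustration of that split, here is a minimal self-contained sketch. The `oidcEndpoints` type and `buildOIDCEndpoints` function are hypothetical names, not the operator's actual code; the realm name `nebari` is taken from the URLs above, and trimming a trailing slash from the base URLs is an added assumption to avoid double slashes.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// oidcEndpoints holds the four provider URLs rendered into the SecurityPolicy.
// Hypothetical type, for illustration only.
type oidcEndpoints struct {
	AuthorizationEndpoint string
	EndSessionEndpoint    string
	Issuer                string
	TokenEndpoint         string
}

// buildOIDCEndpoints picks the base URL per endpoint: browser-facing endpoints
// use the external URL, server-to-server endpoints use the in-cluster URL.
func buildOIDCEndpoints(internalURL, externalURL, realm string) oidcEndpoints {
	// Assumption: tolerate a trailing slash on either base URL.
	internal := strings.TrimRight(internalURL, "/") + "/realms/" + realm
	external := strings.TrimRight(externalURL, "/") + "/realms/" + realm
	return oidcEndpoints{
		AuthorizationEndpoint: external + "/protocol/openid-connect/auth",
		EndSessionEndpoint:    external + "/protocol/openid-connect/logout",
		Issuer:                internal,
		TokenEndpoint:         internal + "/protocol/openid-connect/token",
	}
}

func main() {
	// Both env vars are the ones the operator already receives.
	e := buildOIDCEndpoints(os.Getenv("KEYCLOAK_URL"), os.Getenv("KEYCLOAK_EXTERNAL_URL"), "nebari")
	fmt.Println("authorizationEndpoint:", e.AuthorizationEndpoint)
	fmt.Println("endSessionEndpoint:   ", e.EndSessionEndpoint)
	fmt.Println("issuer:               ", e.Issuer)
	fmt.Println("tokenEndpoint:        ", e.TokenEndpoint)
}
```

Keeping the choice of base URL in one function means the browser-facing versus server-to-server distinction lives in a single place rather than being repeated at each call site.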