Skip to content

NVLOG to prevent unused var#1758

Merged
mattklein123 merged 5 commits intomasterfrom
fix-clang-compile-nvlog
Sep 27, 2017
Merged

NVLOG to prevent unused var#1758
mattklein123 merged 5 commits intomasterfrom
fix-clang-compile-nvlog

Conversation

@danielhochman
Copy link
Copy Markdown
Contributor

clang 5 fails when NVLOG set because resource is unused

Signed-off-by: Daniel Hochman dhochman@lyft.com

Signed-off-by: Daniel Hochman <dhochman@lyft.com>
@danielhochman danielhochman requested a review from htuch September 27, 2017 19:28
htuch
htuch previously approved these changes Sep 27, 2017
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is OK for now. I think it would be good to factor out the "dump a string representation of everything in a vector/RepeatedPtrField", so maybe just add a TODO for this. Did we just hit this because of a silent upgrade in the Docker image?

@danielhochman
Copy link
Copy Markdown
Contributor Author

@htuch I noticed this in my local env. We don't use NVLOG anywhere in testing. On our release build we use NVLOG but with gcc and not clang.

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good other than @htuch TODO comment. Please merge master also to fix CI.

version_info_ = version_info;
ENVOY_LOG(debug, "gRPC config for {} accepted with {} resources", type_url_, resources.size());

#ifndef NVLOG
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: outdent ifdef

Signed-off-by: Daniel Hochman <dhochman@lyft.com>
Daniel Hochman added 2 commits September 27, 2017 13:01
Signed-off-by: Daniel Hochman <dhochman@lyft.com>
…ompile-nvlog

Signed-off-by: Daniel Hochman <dhochman@lyft.com>
#define ENVOY_STREAM_LOG(LEVEL, FORMAT, STREAM, ...) \
ENVOY_STREAM_LOG_TO_LOGGER(ENVOY_LOGGER(), LEVEL, FORMAT, STREAM, ##__VA_ARGS__)

// TODO(danielhochman): macro(s) for simple logging of structures that support iteration
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: period end of sentence, "macro(s)/function(s)"

Signed-off-by: Daniel Hochman <dhochman@lyft.com>
@mattklein123 mattklein123 merged commit 0a35d49 into master Sep 27, 2017
@mattklein123 mattklein123 deleted the fix-clang-compile-nvlog branch September 27, 2017 21:08
costinm pushed a commit to costinm/envoy that referenced this pull request Oct 2, 2017
Signed-off-by: Daniel Hochman <dhochman@lyft.com>
mathetake pushed a commit that referenced this pull request Mar 3, 2026
…1758)

**Description**
Implements redaction of sensitive information in debug logs to prevent
exposure of user prompts, API keys, AI-generated content. The redaction
system preserves json-structure for requests/responses and includes
length/hash metadata for debugging purposes.

**Problem**

Debug logging currently exposes sensitive information including:

- User prompts and messages (text, images, audio, files)
- AI-generated responses and tool call arguments
- API keys and authorization tokens in headers
- Tool definitions, function schemas, and response format schemas
- Prediction content and guided JSON schemas

**Solution**

**Request Body Redaction**

Added `RedactSensitiveInfoFromRequest` interface method to
`EndpointSpec` with full implementation for chat completions endpoint
(placeholders for others):

**Redacted Content:**

- **Messages**: All message types (user, assistant, system, developer,
tool)
  - Text content with length and hash placeholders
  - Images (URLs and base64 data)
  - Audio data (base64-encoded)
  - File attachments
- **Tool Definitions**: Function descriptions and parameter schemas
- **Tool Call Arguments**: Function names and arguments in assistant
messages
- **Response Formats**: JSON schema names, descriptions, and schema
definitions
- **Guided JSON**: Raw JSON schema content
- **Prediction Content**: Cached prompts and prefill content

**Redaction Format:**

```
[REDACTED LENGTH=142 HASH=a3f5e8c2]
```

The hash (SHA256) enables:

- Correlating identical content across requests for cache debugging
- Identifying duplicate requests without content exposure
- Matching redacted logs to specific content patterns

**Example**

**Before** (debug log exposes sensitive content):

```shell
time=2026-01-12T11:39:01.848-05:00 level=DEBUG msg="request body processing" request_id=suku-234 is_upstream_filter=false request="request_body:{body:\"{\\n  \\\"model\\\": \\\"gcp.gemini-2.5-flash\\\",\\n  \\\"messages\\\": [\\n    {\\n      \\\"role\\\": \\\"user\\\",\\n      \\\"content\\\": \\\"What is capital of France?\\\"\\n    }\\n  ],\\n  \\\"max_completion_tokens\\\": 101,\\n  \\\"stream\\\": false\\n}\" end_of_stream:true} metadata_context:{}"
```

**After** (redacted with debugging metadata):

```shell
time=2026-01-12T16:36:35.689-05:00 level=DEBUG msg="request body processing" request_id=suku-234 is_upstream_filter=false request="{\"messages\":[{\"content\":\"[REDACTED LENGTH=26 HASH=e3b235f0]\",\"role\":\"user\"}],\"model\":\"gcp.gemini-2.5-flash\",\"max_completion_tokens\":101}"
```

## Special notes for reviewers
- Declaration (as per point 4 [1] ): AI tool was used to generate part
of the code in this PR.

Open to suggestion from community :) 

1:
https://github.com/envoyproxy/ai-gateway/blob/953951fb5c9cafc7e1a8747c64b13cff291fb1ce/CONTRIBUTING.md#what-is-allowed

---------

Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>
Signed-off-by: Dan Sun <dsun20@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
mathetake pushed a commit that referenced this pull request Mar 3, 2026
**Description**

## Fix panic when logging malformed request responses

### Problem

The `ai-gateway-extproc` container crashes with a panic when it receives
malformed JSON requests:

panic: interface conversion: ext_procv3.isProcessingResponse_Response is
  *ext_procv3.ProcessingResponse_ImmediateResponse, not
  *ext_procv3.ProcessingResponse_RequestBody

**Impact**: Every malformed request causes the container to restart,
leading to:
  - Service disruption (pods repeatedly crash)
  - Lost request processing capacity
  - Kubernetes backoff delays before recovery

  **Observed behavior in production**:
  RESTARTS       AGE
  14 (35m ago)   24h
  13 (36m ago)   24h
  10 (39m ago)   24h

  ### Root Cause

When `ProcessRequestBody` receives invalid JSON, it correctly returns a
user-facing error as an `ImmediateResponse`:

  ```go
  // processor_impl.go:207-210
if userFacingErr := internalapi.GetUserFacingError(err); userFacingErr
!= nil {
r.logger.Info("returning user-facing error for malformed request", ...)
      return createUserFacingErrorResponse(400, "BadRequest", ...), nil
  }
```

  However, the debug logging code at server.go:307 assumes the response is always a RequestBody and performs an unsafe type assertion:

  // server.go:307 - PANICS HERE
  rb := resp.Response.(*extprocv3.ProcessingResponse_RequestBody)

This bug was introduced in commit 3445fa9 (PR #1758) when refactoring the redaction logic. The old code had proper type checking:

  original, ok := resp.Response.(*extprocv3.ProcessingResponse_RequestBody)
  if !ok || original.RequestBody == nil {
      // Meaning this is the immediate response, that doesn't need to be filtered.
      return resp
  }

  Solution

  Replace the unsafe type assertion with a type switch that handles both response types:

  switch val := resp.Response.(type) {
  case *extprocv3.ProcessingResponse_RequestBody:
      logContent := redactRequestBodyResponse(val, l, sensitiveHeaderKeys, s.enableRedaction)
      l.Debug("request body processed", slog.Any("response", logContent))
  case *extprocv3.ProcessingResponse_ImmediateResponse:
      // ImmediateResponse (e.g., for malformed requests) doesn't need request body redaction
      l.Debug("request body processed", slog.Any("response", val))
  }

  This matches the pattern already used for response body processing at server.go:338-343.

  Testing

  Added test case: TestServer_processMsg/request_body_with_immediate_response_(malformed_request)

  This test verifies that processMsg correctly handles ImmediateResponse from ProcessRequestBody without panicking.

  All tests pass:
  $ go test -v ./internal/extproc/
  PASS
  ok    github.com/envoyproxy/ai-gateway/internal/extproc       3.920s

  Reproduction

  Before this fix, any malformed JSON request would crash the container:

  curl -X POST https://url.com/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"messages":[{"role":"user","content":{"invalid":"json structure"}}]}'

  Logs show:
  level=INFO msg="returning user-facing error for malformed request"
    error="malformed request: failed to parse JSON for /v1/chat/completions"
  panic: interface conversion: ext_procv3.isProcessingResponse_Response is
    *ext_procv3.ProcessingResponse_ImmediateResponse, not
    *ext_procv3.ProcessingResponse_RequestBody

  After this fix, the same request returns a proper 400 error without crashing.

  Files Changed

  - internal/extproc/server.go: Fixed unsafe type assertion with type switch
  - internal/extproc/server_test.go: Added regression test

  ---
  Fixes: Pod restarts in gateway-system namespace due to malformed request handling
  Related: PR #1758 (commit 3445fa9) that introduced the regression

---------

Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants