AWS WAF logs: scrubbing fails when message is a dict (Forwarder v5.2.1) #1059

@vinarock

Description

When forwarding AWS WAF logs with the Datadog Forwarder Lambda v5.2.1, the message field often arrives as a JSON object (dict), not a string. The forwarder tries to scrub message assuming it’s a string, which throws an exception, so the early scrubbing step fails. The batch-level scrubbing still hides sensitive fields in the final payload, but the intermediate error shows up and the early scrubbing is skipped.

Environment

Forwarder: datadog-serverless-functions, aws-dd-forwarder-5.2.1 (Layer v95)
Log source: AWS WAF (WAFv2) via CloudWatch Logs
Runtime: AWS Lambda

Steps to Reproduce

Deploy Forwarder v5.2.1.
Subscribe it to an AWS WAF CloudWatch Logs group.
Produce a WAF event where message is an object (dict/list).
Observe an exception when scrubbing message, because it’s not a string.
Environment variables:
  DD_SCRUBBING_RULE (Basic|Bearer)\s[^"]+
  DD_SCRUBBING_RULE_REPLACEMENT HIDDEN
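
The failure can be illustrated outside the Forwarder with a minimal sketch. It assumes the scrubber boils down to a regex substitution over the message; the `scrub` function and the sample messages below are hypothetical stand-ins, not the Forwarder's actual code. Only the rule and replacement values come from the environment variables above.

```python
import re

# Values taken from the environment variables listed in this issue.
SCRUBBING_RULE = r'(Basic|Bearer)\s[^"]+'
REPLACEMENT = "HIDDEN"


def scrub(payload):
    """Hypothetical stand-in for the Forwarder's scrubber: a plain regex substitution."""
    return re.sub(SCRUBBING_RULE, REPLACEMENT, payload)


# A string message is scrubbed without problems.
string_message = '{"name":"authorization","value":"Bearer abc123"}'
scrubbed = scrub(string_message)

# A dict message (illustrative WAF-like shape) makes re.sub raise TypeError,
# because the regex engine only accepts str or bytes-like input.
dict_message = {"httpRequest": {"headers": [{"name": "authorization", "value": "Bearer abc123"}]}}
try:
    scrub(dict_message)
    error = None
except TypeError as exc:
    error = exc
```

If the real scrubber works this way, any non-string message reaching it would fail in the same manner, which matches the exception reported below.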

Expected Behavior

No exception even if message is dict/list/bytes.
Scrubbing applied safely, or skipped locally and applied after batch serialization.

Actual Behavior

The code attempts to scrub message directly; if message is dict/list, it fails.
This creates noisy errors and the early scrubbing of message doesn’t run.
The global scrubbing on the serialized batch still masks sensitive fields.

[ERROR]	2026-02-03T09:57:06.263Z	6d700627-90fe-4a0c-a31e-f0cda8a7f9fc	Exception while scrubbing log message {'timestamp': 1770112318534, 'formatVersion': 1, 'webaclId': 'arn:aws:wafv2:

Workaround Applied

We added the following normalization before scrubbing message at forwarder.py line 101:

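import json

# Original lines, which raise when message is not a str (replaced by the block below):
# log["message"] = scrubber.scrub(log["message"])
# evaluated_log = log["message"]

# Normalize message to a string before scrubbing.
if isinstance(log["message"], (dict, list)):
    # Compact JSON; default=str covers values json cannot serialize natively.
    payload = json.dumps(log["message"], ensure_ascii=False, separators=(",", ":"), default=str)
elif isinstance(log["message"], (bytes, bytearray)):
    # Decode bytes, replacing invalid UTF-8 sequences instead of raising.
    payload = log["message"].decode("utf-8", "replace")
elif isinstance(log["message"], str):
    payload = log["message"]
else:
    payload = str(log["message"])

log["message"] = scrubber.scrub(payload)
evaluated_log = log["message"]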

Proposed Fix

In the “apply scrubbing rules to inner log message” section, guard by type:
    If message is str, scrub directly.
    If dict/list, json.dumps(..., default=str) then scrub.
    If bytes, decode to UTF-8 (replace) then scrub.
    Otherwise, str() then scrub.
Minimal alternative: if message is not str, skip local scrubbing and rely on the batch-level scrubbing after serialization.
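
Both options above can be sketched as standalone functions. This is a minimal sketch, not the Forwarder's implementation: `scrub` is a hypothetical stand-in for `scrubber.scrub()`, assumed here to be a regex substitution using the rule from this issue, and `normalize_message`, `scrub_message`, and `scrub_message_minimal` are illustrative names.

```python
import json
import re

# Rule and replacement from the environment variables in this issue.
SCRUBBING_RULE = re.compile(r'(Basic|Bearer)\s[^"]+')
REPLACEMENT = "HIDDEN"


def scrub(payload: str) -> str:
    """Hypothetical stand-in for scrubber.scrub(): a plain regex substitution."""
    return SCRUBBING_RULE.sub(REPLACEMENT, payload)


def normalize_message(message) -> str:
    """Turn any message into a str so the regex-based scrubber can run on it."""
    if isinstance(message, str):
        return message
    if isinstance(message, (dict, list)):
        # default=str keeps json.dumps from raising on non-serializable values.
        return json.dumps(message, ensure_ascii=False, separators=(",", ":"), default=str)
    if isinstance(message, (bytes, bytearray)):
        # Invalid UTF-8 sequences are replaced rather than raising.
        return bytes(message).decode("utf-8", "replace")
    return str(message)


def scrub_message(message) -> str:
    """Full variant: normalize by type, then scrub."""
    return scrub(normalize_message(message))


def scrub_message_minimal(message):
    """Minimal variant: scrub only str, leave other types to batch-level scrubbing."""
    return scrub(message) if isinstance(message, str) else message
```

The full variant turns a dict or list message into a JSON string, whereas the minimal variant leaves non-string messages unchanged, so the two differ in what downstream code receives.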

Impact

Unnecessary exceptions in Forwarder logs.
Early scrubbing doesn’t run for non-string messages, relying solely on batch-level scrubbing.

Happy to provide more details or open a PR if helpful.
