Description
When forwarding AWS WAF logs with the Datadog Forwarder Lambda v5.2.1, the message field often arrives as a JSON object (dict), not a string. The forwarder tries to scrub message assuming it’s a string, which throws an exception, so the early scrubbing step fails. The batch-level scrubbing still hides sensitive fields in the final payload, but the intermediate error shows up and the early scrubbing is skipped.
Environment
Forwarder: datadog-serverless-functions, aws-dd-forwarder-5.2.1 (Layer v95)
Log source: AWS WAF (WAFv2) via CloudWatch Logs
Runtime: AWS Lambda
Steps to Reproduce
Deploy Forwarder v5.2.1.
Subscribe it to an AWS WAF CloudWatch Logs group.
Produce a WAF event where message is an object (dict/list).
Observe an exception when scrubbing message, because it’s not a string.
Environment variables:
DD_SCRUBBING_RULE: (Basic|Bearer)\s[^"]+
DD_SCRUBBING_RULE_REPLACEMENT: HIDDEN
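The failure can be illustrated outside the forwarder with a small sketch. It assumes the scrubber applies a compiled regex substitution to whatever it receives; the `scrub` function and the sample values below are hypothetical stand-ins, not the forwarder's actual code.

```python
import re

# Same rule and replacement as the environment variables above.
RULE = re.compile(r'(Basic|Bearer)\s[^"]+')
REPLACEMENT = "HIDDEN"


def scrub(payload):
    # Hypothetical stand-in for the forwarder's scrubber:
    # a plain regex substitution that expects a string.
    return RULE.sub(REPLACEMENT, payload)


# A string message is scrubbed as expected.
string_message = '{"authorization":"Bearer abc123"}'
scrubbed = scrub(string_message)

# A dict message makes the regex substitution raise TypeError.
dict_message = {"authorization": "Bearer abc123"}
try:
    scrub(dict_message)
    error = None
except TypeError as exc:
    error = exc
```

With a string input the token is replaced; with a dict input `re` raises `TypeError`, which matches the symptom described in this issue.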
Expected Behavior
No exception even if message is dict/list/bytes.
Scrubbing applied safely, or skipped locally and applied after batch serialization.
Actual Behavior
The code attempts to scrub message directly; if message is dict/list, it fails.
This creates noisy errors and the early scrubbing of message doesn’t run.
The global scrubbing on the serialized batch still masks sensitive fields.
[ERROR] 2026-02-03T09:57:06.263Z 6d700627-90fe-4a0c-a31e-f0cda8a7f9fc Exception while scrubbing log message {'timestamp': 1770112318534, 'formatVersion': 1, 'webaclId': 'arn:aws:wafv2:
Workaround Applied
We added the following normalization before scrubbing message at forwarder.py line 101:
import json

# Original lines, replaced by the normalization below:
# log["message"] = scrubber.scrub(log["message"])
# evaluated_log = log["message"]

# Normalize message to a string before scrubbing.
if isinstance(log["message"], (dict, list)):
    # Serialize structured messages to compact JSON.
    payload = json.dumps(log["message"], ensure_ascii=False, separators=(",", ":"), default=str)
elif isinstance(log["message"], (bytes, bytearray)):
    # Decode bytes, replacing invalid UTF-8 sequences.
    payload = log["message"].decode("utf-8", "replace")
elif isinstance(log["message"], str):
    payload = log["message"]
else:
    payload = str(log["message"])
log["message"] = scrubber.scrub(payload)
evaluated_log = log["message"]
Proposed Fix
In the “apply scrubbing rules to inner log message” section, guard by type:
If message is str, scrub directly.
If dict/list, json.dumps(..., default=str) then scrub.
If bytes, decode to UTF-8 (replace) then scrub.
Otherwise, str() then scrub.
Minimal alternative: if message is not str, skip local scrubbing and rely on the batch-level scrubbing after serialization.
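A minimal sketch of the "minimal alternative", assuming batch-level scrubbing is a regex substitution over the JSON-serialized batch. The function names `scrub_message_if_str` and `scrub_batch` and the sample logs are hypothetical and only illustrate the idea; they are not the forwarder's API.

```python
import json
import re

RULE = re.compile(r'(Basic|Bearer)\s[^"]+')
REPLACEMENT = "HIDDEN"


def scrub_message_if_str(log):
    # Local scrubbing only when message is already a string;
    # any other type is left untouched, so no exception is raised.
    message = log.get("message")
    if isinstance(message, str):
        log["message"] = RULE.sub(REPLACEMENT, message)
    return log


def scrub_batch(logs):
    # Batch-level scrubbing: serialize first, then scrub the string.
    serialized = json.dumps(logs, ensure_ascii=False)
    return RULE.sub(REPLACEMENT, serialized)


logs = [
    # WAF-style message delivered as a dict.
    {"message": {"headers": [{"name": "authorization", "value": "Bearer abc123"}]}},
    # Plain string message.
    {"message": "authorization: Basic dXNlcjpwYXNz"},
]

batch = scrub_batch([scrub_message_if_str(log) for log in logs])
```

In this sketch the dict message skips the local step and is masked only once the batch is serialized, while the string message is masked in both steps. The trade-off is that non-string messages depend entirely on the batch-level pass.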
Impact
Unnecessary exceptions in Forwarder logs.
Early scrubbing doesn’t run for non-string messages, relying solely on batch-level scrubbing.
Happy to provide more details or open a PR if helpful.