[FIX][SECURITY]: Add log sanitizer utility to prevent log injection attacks (#3000)#3227
Conversation
3d35a9d to
bcaef69
Compare
|
Thanks for this @smrutisahoo10. Log injection prevention is an important security measure and the approach of centralizing sanitization in a utility is clean. A few items: (1) full CI hasn't triggered yet — only DCO passed, please rebase or push to trigger the full suite, (2) the verification table in the PR description shows blank status — please run |
bcaef69 to
c32bd4b
Compare
|
Hi @crivetimihai, ran make test and make lint. Updated the verification table. |
fd225d0 to
fa135df
Compare
MohanLaksh
left a comment
There was a problem hiding this comment.
Approved. The below test file PASSED all Log Injection Vulnerabilities Tests.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""Manual test script for log sanitizer to verify it works correctly."""
import sys
import logging
import io
# Set UTF-8 encoding for Windows console
if sys.platform == 'win32':
import codecs
sys.stdout = codecs.getwriter('utf-8')(sys.stdout.buffer, 'strict')
# Add mcpgateway to path
sys.path.insert(0, '.')
from mcpgateway.utils.log_sanitizer import sanitize_for_log
# Setup logging to capture output
handler = logging.StreamHandler(stream := io.StringIO())
handler.setFormatter(logging.Formatter('%(levelname)s:%(name)s:%(message)s'))
logger = logging.getLogger('test')
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
print("=" * 60)
print("Testing Log Injection Prevention")
print("=" * 60)
# Test 1: Normal text (should pass through unchanged)
print("\n1. Normal text test:")
normal_text = "normal error message"
logger.warning(f"OAuth error: {sanitize_for_log(normal_text)}")
output = stream.getvalue()
print(f"Input: {repr(normal_text)}")
print(f"Output: {output.strip()}")
assert "\n" not in output or output.count("\n") == 1, "Should be single line"
stream.truncate(0)
stream.seek(0)
# Test 2: Newline injection (the actual vulnerability)
print("\n2. Newline injection test (VULNERABILITY):")
malicious_input = "bad scope\nCRITICAL:root:SECURITY BREACH detected"
logger.warning(f"OAuth error: {sanitize_for_log(malicious_input)}")
output = stream.getvalue()
print(f"Input: {repr(malicious_input)}")
print(f"Output: {output.strip()}")
lines = output.strip().split('\n')
assert len(lines) == 1, f"Should be single line, got {len(lines)} lines"
assert "SECURITY BREACH" in output, "Content should be preserved"
print("✓ Attack prevented - single log line produced")
stream.truncate(0)
stream.seek(0)
# Test 3: CRLF injection
print("\n3. CRLF injection test:")
crlf_input = "error\r\nINFO:fake:Fake log entry"
logger.warning(f"OAuth error: {sanitize_for_log(crlf_input)}")
output = stream.getvalue()
print(f"Input: {repr(crlf_input)}")
print(f"Output: {output.strip()}")
lines = output.strip().split('\n')
assert len(lines) == 1, f"Should be single line, got {len(lines)} lines"
print("✓ CRLF attack prevented")
stream.truncate(0)
stream.seek(0)
# Test 4: Tab injection
print("\n4. Tab character test:")
tab_input = "col1\tcol2\tcol3"
logger.warning(f"Data: {sanitize_for_log(tab_input)}")
output = stream.getvalue()
print(f"Input: {repr(tab_input)}")
print(f"Output: {output.strip()}")
assert "\t" not in output, "Tabs should be removed"
print("✓ Tabs sanitized")
stream.truncate(0)
stream.seek(0)
# Test 5: Multiple control characters
print("\n5. Multiple control characters test:")
multi_input = "text\nwith\rmany\tcontrol\vchars"
logger.warning(f"Message: {sanitize_for_log(multi_input)}")
output = stream.getvalue()
print(f"Input: {repr(multi_input)}")
print(f"Output: {output.strip()}")
assert "\n" not in output or output.count("\n") == 1
assert "\r" not in output
assert "\t" not in output
assert "\v" not in output
print("✓ All control characters sanitized")
stream.truncate(0)
stream.seek(0)
# Test 6: Real-world OAuth scenario
print("\n6. Real-world OAuth callback scenario:")
error = "invalid_scope"
error_description = "Requested scope not available\nCRITICAL:security:BREACH DETECTED"
logger.warning(f"OAuth provider returned error callback: error={sanitize_for_log(error)}, description={sanitize_for_log(error_description)}")
output = stream.getvalue()
print(f"Error: {repr(error)}")
print(f"Description: {repr(error_description)}")
print(f"Output: {output.strip()}")
lines = output.strip().split('\n')
assert len(lines) == 1, f"Should be single line, got {len(lines)} lines"
print("✓ Real-world scenario protected")
print("\n" + "=" * 60)
print("✅ ALL TESTS PASSED - Log injection vulnerability fixed!")
print("=" * 60)
…ging Signed-off-by: Smruti Sahoo <smruti.sahoo1@ibm.com>
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
fa135df to
00d02cb
Compare
crivetimihai
left a comment
There was a problem hiding this comment.
Review: Approved
Clean, focused security fix for CWE-117 (log injection via control characters in untrusted input).
What was reviewed
mcpgateway/utils/log_sanitizer.py— New utility withsanitize_for_log(),sanitize_dict_for_log(), andsanitize_optional(). Pre-compiled regex[\x00-\x1f\x7f-\x9f]correctly strips C0/C1 control characters while preserving printable space.- Router integration — Applied at 4 sites where untrusted user input (OAuth callback query params, SSO redirect_uri, RFC 9728 path segments) is logged. Server-side values correctly left unsanitized.
- Test suite — 45 tests with 100% coverage on the new module, including real-world attack scenarios (CRLF injection, log forging, SIEM evasion).
Changes made during review
- Rebased onto
main(clean, no conflicts) - Removed unused
import pytestfrom test file
Notes
- Unicode line separators (U+2028/U+2029) are intentionally not filtered — Python's logging module doesn't treat them as newlines, so they're not an injection vector. This is correctly documented in the tests.
- No security or performance issues introduced.
Signed-off-by: Mihai Criveti <crivetimihai@gmail.com>
…ttacks (#3000) (#3227) * Added utility for strips and control character from values before logging Signed-off-by: Smruti Sahoo <smruti.sahoo1@ibm.com> * fix(tests): remove unused pytest import from log sanitizer tests Signed-off-by: Mihai Criveti <crivetimihai@gmail.com> * style(tests): remove trailing whitespace in log sanitizer tests Signed-off-by: Mihai Criveti <crivetimihai@gmail.com> --------- Signed-off-by: Smruti Sahoo <smruti.sahoo1@ibm.com> Signed-off-by: Mihai Criveti <crivetimihai@gmail.com> Co-authored-by: Smruti Sahoo <smruti.sahoo1@ibm.com> Co-authored-by: Mihai Criveti <crivetimihai@gmail.com>
…ttacks (#3000) (#3227) * Added utility for strips and control character from values before logging Signed-off-by: Smruti Sahoo <smruti.sahoo1@ibm.com> * fix(tests): remove unused pytest import from log sanitizer tests Signed-off-by: Mihai Criveti <crivetimihai@gmail.com> * style(tests): remove trailing whitespace in log sanitizer tests Signed-off-by: Mihai Criveti <crivetimihai@gmail.com> --------- Signed-off-by: Smruti Sahoo <smruti.sahoo1@ibm.com> Signed-off-by: Mihai Criveti <crivetimihai@gmail.com> Co-authored-by: Smruti Sahoo <smruti.sahoo1@ibm.com> Co-authored-by: Mihai Criveti <crivetimihai@gmail.com> Signed-off-by: Yosief Eyob <yosiefogbazion@gmail.com>
…ging
🔗 Related Issue
Closes #3000
📝 Summary
Added utility function to strip control characters from values before logging to prevent log injection attacks. This prevents malicious input from manipulating log output by injecting newlines, carriage returns, or other control characters that could be used to forge log entries or hide malicious activity.
Files Added:
Modified files:
🏷️ Type of Change
🧪 Verification
make lintmake testmake coverage✅ Checklist
make black isort pre-commit)📓 Notes (optional)
Screenshots, design decisions, or additional context.