XmlLayout: replace invalid XML characters with U+FFFD #4077
Merged
This change sanitizes the output of `XmlLayout` by replacing characters that are not permitted in XML 1.0 with the Unicode replacement character (`U+FFFD`). This guarantees that the generated log output is always well-formed XML and can be parsed by any XML 1.0–compliant parser, even when log data contains control characters or other invalid code points.

Co-authored-by: Volkan Yazıcı <volkan@yazi.ci>
This was referenced Mar 24, 2026
Pull request overview
This PR hardens XmlLayout output generation by sanitizing characters that are not permitted in XML 1.0, replacing them with U+FFFD so log output remains well-formed and parseable.
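As a rough illustration of the rule described above, the sketch below checks each code point against the `Char` production of the XML 1.0 specification and substitutes `U+FFFD` for anything outside it. The class and method names (`XmlCharSanitizerSketch`, `isValidXml10`, `sanitize`) are hypothetical and are not taken from this PR; the actual change operates on Jackson's XML writer rather than on plain strings.

```java
public final class XmlCharSanitizerSketch {

    // Unicode replacement character, used in place of invalid code points.
    private static final int REPLACEMENT = 0xFFFD;

    private XmlCharSanitizerSketch() {}

    // XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isValidXml10(int codePoint) {
        return codePoint == 0x9
                || codePoint == 0xA
                || codePoint == 0xD
                || (codePoint >= 0x20 && codePoint <= 0xD7FF)
                || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
                || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }

    // Walks the input by code point so that valid surrogate pairs are kept
    // intact, while unpaired surrogates fall in the invalid range and are replaced.
    static String sanitize(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            sb.appendCodePoint(isValidXml10(cp) ? cp : REPLACEMENT);
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A control character between two letters is replaced; the letters are kept.
        System.out.println(sanitize("a\u0001b"));
    }
}
```

Iterating by code point rather than by `char` is a deliberate choice here: supplementary characters such as emoji are legal in XML 1.0 and should pass through unchanged.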
Changes:
- Introduces a `SanitizingXmlFactory`/`SanitizingWriter` wrapper for Jackson XML output that replaces invalid XML 1.0 code points with `U+FFFD`.
- Adds `stax2-api` dependency management and a core dependency needed for the new StAX2 writer delegation.
- Adds tests covering invalid/valid XML character handling and a changelog entry.
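The wrapper idea from the list above can be sketched in a simplified form. The PR describes a delegate around the StAX2 `XMLStreamWriter` used by Jackson; the example below deliberately swaps in a plain `java.io.FilterWriter` instead, so that it runs on the JDK alone. The class name `SanitizingWriterSketch` and the `demo` helper are hypothetical, and the code should be read as an assumption about the general approach, not as the PR's implementation.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

public class SanitizingWriterSketch extends FilterWriter {

    private static final char REPLACEMENT = '\uFFFD';

    // Holds a high surrogate until the next char arrives; 0 means "none pending".
    private char pendingHigh;

    public SanitizingWriterSketch(Writer out) {
        super(out);
    }

    // Valid XML 1.0 characters within the BMP (surrogates are handled separately).
    private static boolean isValidBmp(char ch) {
        return ch == 0x9 || ch == 0xA || ch == 0xD
                || (ch >= 0x20 && ch <= 0xD7FF)
                || (ch >= 0xE000 && ch <= 0xFFFD);
    }

    @Override
    public void write(int c) throws IOException {
        char ch = (char) c;
        if (pendingHigh != 0) {
            char high = pendingHigh;
            pendingHigh = 0;
            if (Character.isLowSurrogate(ch)) {
                // Complete surrogate pair: always a valid supplementary code point.
                out.write(high);
                out.write(ch);
                return;
            }
            // The high surrogate was not followed by a low surrogate.
            out.write(REPLACEMENT);
        }
        if (Character.isHighSurrogate(ch)) {
            pendingHigh = ch;
        } else {
            out.write(isValidBmp(ch) ? ch : REPLACEMENT);
        }
    }

    @Override
    public void write(char[] buf, int off, int len) throws IOException {
        for (int i = off; i < off + len; i++) {
            write(buf[i]);
        }
    }

    @Override
    public void write(String str, int off, int len) throws IOException {
        for (int i = off; i < off + len; i++) {
            write(str.charAt(i));
        }
    }

    @Override
    public void close() throws IOException {
        if (pendingHigh != 0) {
            // Dangling high surrogate at end of stream.
            pendingHigh = 0;
            out.write(REPLACEMENT);
        }
        super.close();
    }

    // Hypothetical helper: runs a string through the wrapper and returns the result.
    static String demo(String input) {
        StringWriter sink = new StringWriter();
        try (Writer w = new SanitizingWriterSketch(sink)) {
            w.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }
}
```

Buffering a pending high surrogate matters because a surrogate pair may be split across two `write` calls; without that state, a valid supplementary character could be replaced by mistake.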
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `src/changelog/.2.x.x/4077_xml-control-characters.xml` | Changelog entry documenting XML sanitization behavior. |
| `log4j-parent/pom.xml` | Adds managed version for `org.codehaus.woodstox:stax2-api`. |
| `log4j-core/src/main/java/.../Log4jXmlObjectMapper.java` | Implements XML 1.0 character sanitization via a custom `XmlFactory`/`XMLStreamWriter` delegate. |
| `log4j-core/pom.xml` | Updates OSGi package import pattern and adds optional `stax2-api` dependency. |
| `log4j-core-test/src/test/java/.../XmlLayoutTest.java` | Adds tests intended to verify invalid XML characters are sanitized. |
vy approved these changes on Mar 24, 2026