Skip to content

XmlLayout: replace invalid XML characters with U+FFFD #4077

Merged
ppkarwasz merged 6 commits into 2.25.x from fix/2.25.x/xml-control-characters on Mar 24, 2026

@ppkarwasz

This change sanitizes the output of `XmlLayout` by replacing characters that are not permitted in XML 1.0 with the Unicode replacement character (`U+FFFD`).

This guarantees that the generated log output is always well-formed XML and can be parsed by any XML 1.0–compliant parser, even when log data contains control characters or other invalid code points.

Co-authored-by: Volkan Yazıcı <volkan@yazi.ci>
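
To illustrate why the replacement matters, here is a minimal sketch using only the JDK's built-in XML parser. The class name `WellFormedCheck`, the helper `isWellFormed`, and the `<Message>` element are hypothetical and chosen for illustration; they are not taken from this PR or from the actual `XmlLayout` output.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of Log4j: reports whether a string parses as XML.
public final class WellFormedCheck {

    private WellFormedCheck() {}

    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the document, e.g. because of an invalid character.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            // Not expected for an in-memory string with the default configuration.
            throw new IllegalStateException(e);
        }
    }
}
```

With this helper, a document containing the control character U+0001 in element content is rejected, while the same document with U+FFFD in its place is accepted. The JDK parser may also print a fatal-error line to standard error for the rejected input.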

Copilot AI left a comment

Pull request overview

This PR hardens XmlLayout output generation by sanitizing characters that are not permitted in XML 1.0, replacing them with U+FFFD so log output remains well-formed and parseable.

Changes:

  • Introduces a `SanitizingXmlFactory` / `SanitizingWriter` wrapper for Jackson XML output that replaces invalid XML 1.0 code points with `U+FFFD`.
  • Adds `stax2-api` dependency management and a core dependency needed for the new StAX2 writer delegation.
  • Adds tests covering invalid/valid XML character handling and a changelog entry.
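
The replacement rule itself can be sketched in a few lines. This is not the PR's `SanitizingWriter`, whose code is not shown here; it is a standalone illustration that assumes the `Char` production of the XML 1.0 specification (`#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]`). The class and method names are hypothetical.

```java
// Hypothetical standalone sketch, not the implementation merged in this PR.
public final class Xml10Sanitizer {

    private static final int REPLACEMENT = 0xFFFD;

    private Xml10Sanitizer() {}

    // True if the code point matches the XML 1.0 "Char" production.
    static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Replaces every code point outside the XML 1.0 range with U+FFFD.
    static String sanitize(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            // codePointAt returns an unpaired surrogate as-is, so it falls
            // into the invalid range and is replaced like any other bad char.
            int cp = input.codePointAt(i);
            sb.appendCodePoint(isValidXml10(cp) ? cp : REPLACEMENT);
            i += Character.charCount(cp);
        }
        return sb.toString();
    }
}
```

Iterating by code point rather than by `char` keeps valid surrogate pairs (supplementary characters) intact while still catching lone surrogates. A streaming wrapper such as the one described above would presumably apply the same check while writing instead of building a new string.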

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `src/changelog/.2.x.x/4077_xml-control-characters.xml` | Changelog entry documenting XML sanitization behavior. |
| `log4j-parent/pom.xml` | Adds managed version for `org.codehaus.woodstox:stax2-api`. |
| `log4j-core/src/main/java/.../Log4jXmlObjectMapper.java` | Implements XML 1.0 character sanitization via a custom `XmlFactory`/`XMLStreamWriter` delegate. |
| `log4j-core/pom.xml` | Updates OSGi package import pattern and adds optional `stax2-api` dependency. |
| `log4j-core-test/src/test/java/.../XmlLayoutTest.java` | Adds tests intended to verify invalid XML characters are sanitized. |

@vy added the bug (Incorrect, unexpected, or unintended behavior of existing code) and layouts (Affects one or more Layout plugins) labels Mar 24, 2026
@vy added this to the 2.25.4 milestone Mar 24, 2026
@ppkarwasz merged commit 4f50142 into 2.25.x Mar 24, 2026
7 checks passed
@ppkarwasz deleted the fix/2.25.x/xml-control-characters branch March 24, 2026 22:56
@github-project-automation (bot) moved this from Approved to Merged in Log4j pull request tracker Mar 24, 2026