fix character escaping in checkstyle formatter#1104

Merged
m-ildefons merged 3 commits into hadolint:master from m-ildefons:checkstyle-escaped-characters
Aug 15, 2025

@m-ildefons
Member

Fix special character escaping in checkstyle formatter. The checkstyle formatter produces an XML document, where only <, >, &, ' and " need to be escaped.

related-to: #1065

What I did

Fix special character escaping to match XML 1.0 expectation: https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#Standard_public_entity_sets_for_characters

How to verify it

Before this change:

<?xml version='1.0' encoding='UTF-8'?><checkstyle version='4.3'><file name='/home/moritz/tmp/Dockerfile' ><error line='3' column='1' severity='warning' message='Specify version with &#96;dnf install &#45;y &#60;package&#62;&#45;&#60;version&#62;&#96;.' source='DL3041' /></file></checkstyle>

After this change:

<?xml version='1.0' encoding='UTF-8'?><checkstyle version='4.3'><file name='/home/moritz/tmp/Dockerfile' ><error line='3' column='1' severity='warning' message='Specify version with `dnf install -y &lt;package&gt;-&lt;version&gt;`.' source='DL3041' /></file></checkstyle>
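
For illustration, the escaping rule described above can be sketched in a few lines of Haskell. This is a minimal sketch and not the code from this PR: the names `xmlEscapeChar` and `xmlEscape` are hypothetical, and it works on `String` rather than `Text` so that it only needs `base`.

```haskell
-- Hypothetical helper, not hadolint's actual implementation.
-- Replace the five characters that have predefined XML entities;
-- every other character is passed through unchanged.
xmlEscapeChar :: Char -> String
xmlEscapeChar '<'  = "&lt;"
xmlEscapeChar '>'  = "&gt;"
xmlEscapeChar '&'  = "&amp;"
xmlEscapeChar '\'' = "&apos;"
xmlEscapeChar '"'  = "&quot;"
xmlEscapeChar c    = [c]

-- Escape a whole string, e.g. a rule message used as an attribute value.
xmlEscape :: String -> String
xmlEscape = concatMap xmlEscapeChar
```

Applied to the DL3041 message, `xmlEscape` leaves the backticks and hyphens alone and only rewrites the angle brackets, which matches the `message` attribute in the "after" output.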

Fix special character escaping in checkstyle formatter. The checkstyle
formatter produces an XML document, where only `<`, `>`, `&`, `'` and
`"` need to be escaped.

related-to: hadolint#1065

Signed-off-by: Moritz Röhrich <moritz@ildefons.de>
@m-ildefons m-ildefons self-assigned this Aug 4, 2025
@m-ildefons m-ildefons added the bug label Aug 4, 2025
@m-ildefons
Member Author

@rantoniuk would this fix your issue?

@rantoniuk

I didn't test it (as I would need to manually pull the related GH action, build it, publish it and then update my repository to use that custom image), but from your before/after description it looks okay.

then Text.singleton c
else "&#" <> Text.pack (show (ord c)) <> ";"
isOk x = any (\check -> check x) [isAsciiUpper, isAsciiLower, isDigit, (`elem` [' ', '.', '/'])]
else xmlEscape c

Maybe you'd like to use one of the libraries instead?

Member Author

That's probably overall the best way to go.
I had thought about it but wasn't quite sure which library was best to choose.
I've now decided to go with xml-conduit. It was quite a bit of work to move the code over, but I'm happy, as there is now less code overall and the bits that remain seem less fragile to me.
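
As a rough idea of what document generation with xml-conduit can look like, here is a minimal sketch. It is not the code merged in this PR: the helper names `errorElement`, `checkstyleDocument` and `renderCheckstyle` are hypothetical, and only a subset of the checkstyle attributes is shown. It assumes the `xml-conduit`, `text` and `containers` packages are available.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Map as Map
import           Data.Text (Text)
import qualified Data.Text.Lazy as TL
import           Text.XML

-- Hypothetical helper: one <error/> element with a few attributes.
-- Attribute values are plain Text; escaping is left to the renderer.
errorElement :: Text -> Text -> Text -> Element
errorElement line message source =
  Element
    "error"
    (Map.fromList [("line", line), ("message", message), ("source", source)])
    []

-- Hypothetical helper: wrap the errors of one file in a checkstyle document.
checkstyleDocument :: Text -> [Element] -> Document
checkstyleDocument fileName errors =
  Document (Prologue [] Nothing []) root []
  where
    root = Element "checkstyle" (Map.fromList [("version", "4.3")]) [NodeElement file]
    file = Element "file" (Map.fromList [("name", fileName)]) (map NodeElement errors)

-- Render with the library's default settings.
renderCheckstyle :: Text -> [Element] -> TL.Text
renderCheckstyle fileName = renderText def . checkstyleDocument fileName
```

The point of the design is that the formatter only builds a tree of `Element` values and never concatenates markup by hand, so escaping of `<`, `>` and `&` in messages is handled in one place by the library.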

Use xml-conduit to do XML document generation in the checkstyle
formatter.
Using a library with robust encoding routines ensures fewer problems,
e.g. with escaped characters.

Signed-off-by: Moritz Röhrich <moritz@ildefons.de>
Signed-off-by: Moritz Röhrich <moritz@ildefons.de>
@m-ildefons m-ildefons added the formatter This PR/issue relates to output formatters label Aug 14, 2025
@m-ildefons m-ildefons merged commit 19ed357 into hadolint:master Aug 15, 2025
2 checks passed

Labels

bug formatter This PR/issue relates to output formatters
