[core] Fix XMLRenderer encoding issues #2633
By default, the XMLRenderer uses UTF-8 encoding, but the writer uses the system default encoding, and the two don't work well together. This PR provides the new experimental API Renderer::setReportFile so that renderer implementations can create their own writers. The default implementation in AbstractRenderer is backwards compatible.
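To illustrate the idea, here is a minimal sketch of how such a hook could look. It is not the code from this PR: `SketchRenderer`, `SketchXmlRenderer`, `RendererEncodingDemo` and their methods are hypothetical stand-ins, and only the overall shape (a default that keeps using the platform encoding, and an XML renderer that opens its own writer in the declared encoding) is taken from the description above.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical stand-in for a renderer base class; not PMD's actual code. */
abstract class SketchRenderer {
    protected Writer writer;

    void setWriter(Writer writer) {
        this.writer = writer;
    }

    /** Default: open the report with the platform default encoding, as before. */
    void setReportFile(String reportFile) {
        setWriter(open(reportFile, Charset.defaultCharset()));
    }

    /** Opens the report file, or System.out when no file name is given. */
    static Writer open(String reportFile, Charset charset) {
        try {
            OutputStream out = reportFile == null
                    ? System.out
                    : Files.newOutputStream(Paths.get(reportFile));
            return new OutputStreamWriter(out, charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void write(String text) {
        try {
            writer.write(text);
            writer.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

/** Hypothetical XML renderer whose writer matches the declared encoding. */
final class SketchXmlRenderer extends SketchRenderer {
    private final Charset charset;

    SketchXmlRenderer(Charset charset) {
        this.charset = charset;
    }

    @Override
    void setReportFile(String reportFile) {
        // Create our own writer so the bytes agree with the XML declaration.
        setWriter(open(reportFile, charset));
    }

    void start() {
        write("<?xml version=\"1.0\" encoding=\"" + charset.name() + "\"?>\n");
    }
}

final class RendererEncodingDemo {
    /** Renders a declaration plus some text to a temp file and returns the bytes. */
    static byte[] renderToBytes(Charset charset, String text) {
        try {
            Path report = Files.createTempFile("report", ".xml");
            SketchXmlRenderer renderer = new SketchXmlRenderer(charset);
            renderer.setReportFile(report.toString());
            renderer.start();
            renderer.write(text);
            renderer.writer.close();
            byte[] bytes = Files.readAllBytes(report);
            Files.delete(report);
            return bytes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] latin1 = renderToBytes(Charset.forName("ISO-8859-1"), "\u00fc");
        System.out.println(latin1.length + " bytes written");
    }
}
```

With this shape, a renderer that only ever calls `setWriter` keeps working unchanged, while a renderer that cares about its encoding overrides `setReportFile`.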
This PR actually revealed a bug in pmd-regression-tester. With this PR, we are now escaping certain characters in XML which we previously didn't.

before:
<?xml version="1.0" encoding="UTF-8"?>
<pmd xmlns="http://pmd.sourceforge.net/report/2.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://pmd.sourceforge.net/report/2.0.0 http://pmd.sourceforge.net/report_2_0_0.xsd"
version="6.24.0" timestamp="2020-07-05T21:26:00.982">
<file name="/home/andreas/Downloads/chunk-diff-issue-2615/src/TestFile.java">
<violation beginline="2" endline="2" begincolumn="21" endcolumn="35" rule="MethodArgumentCouldBeFinal" ruleset="Code Style" class="TestFile" method="bar" variable="fileName" externalInfoUrl="https://pmd.github.io/pmd-6.24.0/pmd_rules_java_codestyle.html#methodargumentcouldbefinal" priority="3">
Parameter 'fileName' is not assigned and could be declared final
</violation>
</file>
</pmd>

after:
<?xml version="1.0" encoding="UTF-8"?>
<pmd xmlns="http://pmd.sourceforge.net/report/2.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://pmd.sourceforge.net/report/2.0.0 http://pmd.sourceforge.net/report_2_0_0.xsd"
version="6.26.0-SNAPSHOT" timestamp="2020-07-05T21:26:32.079">
<file name="/home/andreas/Downloads/chunk-diff-issue-2615/src/TestFile.java">
<violation beginline="2" endline="2" begincolumn="21" endcolumn="35" rule="MethodArgumentCouldBeFinal" ruleset="Code Style" class="TestFile" method="bar" variable="fileName" externalInfoUrl="https://pmd.github.io/pmd/pmd_rules_java_codestyle.html#methodargumentcouldbefinal" priority="3">
Parameter 'fileName' is not assigned and could be declared final
</violation>
</file>
</pmd>

pmd-regression-tester doesn't parse the character data correctly. With entities, the character data between tags (where the description is) is now split into multiple chunks, which need to be accumulated. The parser currently only takes the last chunk. Hence the difference in the violations, e.g. before:

I'll fix that tomorrow.
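The accumulation described above can be sketched with a plain SAX handler. This is only an illustration in Java, not the pmd-regression-tester code; `ViolationTextHandler` and `parse` are hypothetical names, and the `&apos;` entities in the sample input are an assumption about which characters now get escaped. The point is that `characters` may be called several times per element, so each chunk is appended to a buffer instead of overwriting the previous one.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler that collects the full text of each violation element. */
final class ViolationTextHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> messages = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("violation".equals(qName)) {
            buffer.setLength(0); // start a fresh message
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called more than once per element, e.g. around entities:
        // append every chunk rather than keeping only the last one.
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("violation".equals(qName)) {
            messages.add(buffer.toString().trim());
        }
    }

    /** Parses the given report XML and returns the violation messages. */
    static List<String> parse(String xml) {
        try {
            ViolationTextHandler handler = new ViolationTextHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.messages;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Sample input with escaped apostrophes (assumed for illustration).
        String xml = "<file><violation>\n"
                + "Parameter &apos;fileName&apos; is not assigned and could be declared final\n"
                + "</violation></file>";
        System.out.println(parse(xml));
    }
}
```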
In order to properly support different encodings, an OutputStream is needed. Then Java will take care of unmappable characters and encode them as entities for XML. For backwards compatibility, a writer is still created and exposed.
And use it in both CPD/PMD XMLRenderers.
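What "encoding unmappable characters as entities" amounts to can be shown by hand. The sketch below is not what the renderers do internally and is not part of this PR; `CharRefEscaper` and `escapeUnmappable` are hypothetical names. It only uses `CharsetEncoder.canEncode` to decide, per code point, whether the target encoding can represent a character, and falls back to a numeric character reference when it cannot. It does not escape XML special characters such as `<` or `&`.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper illustrating numeric character references. */
final class CharRefEscaper {
    /** Replaces code points the charset cannot encode with &#x...; references. */
    static String escapeUnmappable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            String s = new String(Character.toChars(codePoint));
            if (encoder.canEncode(s)) {
                out.append(s);
            } else {
                out.append("&#x").append(Integer.toHexString(codePoint)).append(';');
            }
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+00FC has no US-ASCII representation, so it becomes a reference.
        System.out.println(escapeUnmappable("a\u00fcb", StandardCharsets.US_ASCII));
    }
}
```

With UTF-8 as the target encoding, nothing needs replacing, since UTF-8 can represent every valid code point.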
I think this is ready now. I've added the experimental API
Describe the PR
This PR tries to fix #2615. In order to use the correct encoding throughout the rendering process, the new API Renderer::setReportFile is added.

I'm not entirely sure about the CPD renderers, whether we have a similar issue there. But as far as I can see, only the default platform encoding is used, and there is no way to change the encoding. There is also no way to specify an output file as a command line option, so CPD renderers always render to System.out and hence should use the default platform encoding.
Related issues
Ready?
./mvnw clean verify passes (checked automatically by travis)