Affects PMD Version: PMD 6.24.0
Description:
When running on a file with a different encoding than the system default, PMD can produce invalid XML.
Steps to reproduce:
- Use the following UTF-8 encoded source file: MyClass.java.zip
- Invoke PMD on Windows 10 as follows:
pmd -d C:/tmp/MyClass.java -rulesets category/java/errorprone.xml/AvoidDuplicateLiterals -format xml -language java > out.xml
This produces the following XML output: out.xml.zip
The encoding of this file should be UTF-8 (as indicated at the start of the file), but it contains characters in a different encoding. Running xmllint on it produces:
/tmp/out.xml:8: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xE9 0xE9 0xE9 0xE9
The String literal "��������" appears 5 times in this file; the first
^
In fact, in my case, these characters are in the Windows-1255 encoding. I think PMD uses this encoding because it uses the system property file.encoding. One workaround is to use the same encoding as the file (UTF-8 by default).
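The suspected mismatch can be reproduced outside PMD with a minimal sketch. It does not call PMD. It only shows that text encoded with Windows-1255 fails strict UTF-8 validation, which is what xmllint reports for out.xml. The class name, the helper methods, and the sample literal (four Hebrew Yod characters) are assumptions for illustration, since the contents of MyClass.java are not shown here. It also assumes a JDK that ships the windows-1255 charset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    /** Encodes text with the named charset, as a writer bound to that charset would. */
    static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    /** Returns true only if the bytes are well-formed UTF-8 (strict decoding, no replacement). */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample literal: four Hebrew Yod characters (U+05D9).
        String literal = "\u05D9\u05D9\u05D9\u05D9";

        // What a writer using the system default (here assumed windows-1255) would emit.
        byte[] legacy = encode(literal, "windows-1255");
        // What the XML declaration promises.
        byte[] utf8 = encode(literal, "UTF-8");

        System.out.printf("windows-1255: first byte 0x%02X, valid UTF-8: %b%n",
                legacy[0] & 0xFF, isValidUtf8(legacy));
        System.out.printf("UTF-8:        first byte 0x%02X, valid UTF-8: %b%n",
                utf8[0] & 0xFF, isValidUtf8(utf8));
    }
}
```

If the report were written with an explicit UTF-8 writer instead of the platform default, the bytes would match the XML declaration regardless of file.encoding.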
Running PMD through: [CLI]