Backslash escaping of CR in XML output #1648

@nnposter

Any CR characters in script output are backslash-escaped (as \x0D) in XML output, while LFs are written as character references (&#xa;):

<elem key="output">No LSB modules are available.\x0D&#xa;Distributor ID:&#x9;Ubuntu\x0D&#xa;Description:&#x9;Ubuntu 16.04.6 LTS\x0D&#xa;Release:&#x9;16.04\x0D&#xa;Codename:&#x9;xenial\x0D&#xa;</elem>

The patch below rectifies the issue, treating CRs just like LFs, so that the same output becomes:

<elem key="output">No LSB modules are available.&#xd;&#xa;Distributor ID:&#x9;Ubuntu&#xd;&#xa;Description:&#x9;Ubuntu 16.04.6 LTS&#xd;&#xa;Release:&#x9;16.04&#xd;&#xa;Codename:&#x9;xenial&#xd;&#xa;</elem>
* Prevents backslash-escaping of CR characters in XML output
--- a/output.cc
+++ b/output.cc
@@ -493,8 +493,21 @@
    xml_write_escaped is not enough; some characters are not allowed to appear in
    XML, not even escaped. */
 std::string protect_xml(const std::string s) {
-  /* escape_for_screen is good enough. */
-  return escape_for_screen(s);
+  std::string r;
+
+  for (unsigned int i = 0; i < s.size(); i++) {
+    char buf[5];
+    unsigned char c = s[i];
+    // Printable and some whitespace ok.
+    if (c == '\t' || c == '\r' || c == '\n' || (0x20 <= c && c <= 0x7e)) {
+      r += c;
+    } else {
+      Snprintf(buf, sizeof(buf), "\\x%02X", c);
+      r += buf;
+    }
+  }
+
+  return r;
 }
 
 /* This is a helper function to determine the ordering of the script results
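
For anyone who wants to try the filtering logic outside the Nmap tree, here is a minimal standalone sketch of the patched loop. It is an approximation, not the code from output.cc: the name protect_xml_sketch is hypothetical, and std::snprintf stands in for Nmap's Snprintf wrapper. It only covers the pre-filtering step; turning the surviving CR/LF/TAB bytes into &#xd;, &#xa; and &#x9; is assumed to happen later in the XML writer, as the example output above suggests.

```cpp
#include <cstdio>
#include <string>

// Hypothetical standalone version of the patched protect_xml().
// Tab, CR, LF and printable ASCII pass through unchanged; every other
// byte is replaced by a literal backslash escape such as \x01.
std::string protect_xml_sketch(const std::string &s) {
  std::string r;

  for (unsigned char c : s) {
    if (c == '\t' || c == '\r' || c == '\n' || (0x20 <= c && c <= 0x7e)) {
      // Left as-is here; the XML writer is assumed to escape it later.
      r += static_cast<char>(c);
    } else {
      // "\xNN" is four characters plus the terminating NUL.
      char buf[5];
      std::snprintf(buf, sizeof(buf), "\\x%02X", static_cast<unsigned int>(c));
      r += buf;
    }
  }

  return r;
}
```

With this sketch, a string such as "a\r\n" comes back unchanged, while a byte like 0x01 or 0x7F is still rewritten to its four-character backslash form.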
