Sanitize ANSII control characters returned from the server #6916

samcoe · 2023-01-25T17:51:41Z

Fixes https://github.com/github/cli/issues/149
Fixes #932

mislav

Looks quite elegant! Thanks!

mislav · 2023-01-26T12:00:36Z

api/http_client.go

+// the values of \u0080 to \u009F represent C1 ASCII control characters. These control
+// characters will be interpreted by the terminal, this behaviour can be used maliciously
+// as an attack vector, especially the control character \u001B. This function escapes
+// all non-printable characters between \u0000 and \u00FF so that the terminal will


My understanding is that non-printable ASCII characters are in the range \u0000 to \u001F. If we escape every character up to \u00FF, wouldn't that also affect all printable characters in ASCII and beyond?

Should we explicitly allow-list some ASCII control characters such as LF and CR so that they don't get mangled?

The server only returns \u escape sequences for non-printable characters, so we do not need to worry about escaping printable characters.

LF and CR are printable characters and the server does not return them as \u sequences.
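The distinction drawn in this thread can be illustrated with a typical JSON encoder. The sketch below uses Go's encoding/json only as a stand-in; whether the server's encoder behaves identically is an assumption, and the helper name encode is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encode returns the JSON encoding of s as produced by Go's encoding/json.
// It stands in for the server's encoder, which is an assumption.
func encode(s string) string {
	b, err := json.Marshal(s)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// ESC has no short escape in JSON, so it is written as a \u sequence.
	fmt.Println(encode("\x1b[31m"))
	// LF and CR have their own short escapes and are not written as \u sequences.
	fmt.Println(encode("a\nb\r"))
}
```

Under that assumption, a sanitizer that only looks for \u00XX sequences never sees LF or CR, which is why no allow-list is needed for them.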

mislav · 2023-01-26T12:00:36Z

api/http_client.go

@@ -64,6 +68,8 @@ func NewHTTPClient(opts HTTPClientOptions) (*http.Client, error) {
 		client.Transport = AddAuthTokenHeader(client.Transport, opts.Config)
 	}

+	client.Transport = SanitizeASCIIControlCharacters(client.Transport)


Does this middleware also affect the gh api command? Since gh api is meant for low-level, programmatic use, maybe it would be good if it remained unaffected.

It does currently affect the gh api command. I think we should sanitize those responses as well, since we want to protect our users throughout the tool. I can envision people copying and pasting gh api commands that request a URL whose response contains malicious data. If there is a valid use case for leaving these responses unsanitized, I am open to it, but I have not been able to think of one.

The way I see it, gh api is meant for scripts, not necessarily to implement presentation. But agreed that flags like --jq and especially --template can be used to format data for the terminal. In those cases it's good if ANSI escape codes are disabled. Maybe it's okay that this middleware sanitizes all those.

mislav · 2023-01-30T18:16:58Z

api/http_client.go

+		var sanitized bytes.Buffer
+		err = replaceControlCharacters(res.Body, &sanitized)
+		if err != nil {
+			err = fmt.Errorf("ascii control characters sanitization error: %w", err)
+		}
+		res.Body.Close()
+		res.Body = io.NopCloser(&sanitized)


One consequence of this middleware is that, by default, the body of every response will first be read fully into a buffer before the middleware yields control to other middleware or to other response-processing code. This prevents any kind of streaming processing of the response JSON.

Would it be feasible to replace res.Body with an io.ReadCloser that does the escaping on the fly as it forwards reads to the original response body and returns each result? That way, every response from the server would be immediately available to the code that makes the request, without this middleware potentially delaying it.

I like that idea. I will investigate implementing it in that manner.

mislav · 2023-01-30T18:16:58Z

api/http_client.go

+// mapControlCharacterToCaret maps C0 control sequences to caret notation and
+// C1 control sequences to hex notation. C1 control sequences do
+// not have caret notation representation.
+func mapControlCharacterToCaret(b []byte) ([]byte, bool) {


Love the bit of added UX by mapping unsafe escape codes to their printable equivalents 👍
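The mapping praised here can be sketched as follows. The function displayForm is hypothetical and works on the decoded code point rather than on the six-byte JSON sequence that the PR's function receives. The exact spelling of the hex form for C1 characters is an assumption, since the diff excerpt does not show it.

```go
package main

import "fmt"

// displayForm is a hypothetical helper that returns a printable stand-in
// for a control character and reports whether the code is one.
func displayForm(code byte) (string, bool) {
	switch {
	case code < 0x20:
		// C0: caret notation is '^' followed by the character 0x40 above the code.
		return "^" + string(rune(code+0x40)), true
	case code >= 0x80 && code <= 0x9f:
		// C1: no caret form exists; the \xNN spelling is an assumption.
		return fmt.Sprintf(`\x%02X`, code), true
	default:
		return "", false
	}
}

func main() {
	for _, c := range []byte{0x1b, 0x07, 0x9b, 'A'} {
		s, ok := displayForm(c)
		fmt.Printf("%#02x -> %q %v\n", c, s, ok)
	}
}
```

When such a stand-in is substituted inside a JSON string, any backslash it contains (for example in the caret form of 0x1c) would itself need to be escaped to keep the document valid.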

mislav · 2023-01-30T18:19:02Z

Forgot one comment in my last review: could there be a test that confirms that this middleware will not try to escape already-escaped Unicode code points in JSON? E.g. that \\u001b should be allowed to stay \\u001b without being converted to \\\u001b

samcoe · 2023-01-31T04:55:19Z

@mislav I changed the implementation to allow for streaming JSON responses, and I also moved the sanitization code to its own files for clarity.

The test I have does include some escaped sequences as well; let me know if you want me to make it more explicit.

mislav

This is a great improvement; thank you! Only style nits remain

mislav · 2023-02-01T13:34:01Z

api/sanitize_ascii.go

+			break
+		}
+
+		if win[0] == 92 && win[1] == 117 && win[2] == 48 && win[3] == 48 {


Nit: for clarity, these numbers could be saved into consts and expressed using Go single-quoted character literals instead of ASCII codes. For example, '\\' and 'u' would be clearer than 92 and 117, respectively.

However, in this case, maybe bytes.HasPrefix could suffice?

Suggested change

-		if win[0] == 92 && win[1] == 117 && win[2] == 48 && win[3] == 48 {
+		if bytes.HasPrefix(win, []byte("\\u00")) {
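A small check that the two forms agree. The function names are hypothetical, and the explicit length guard in the numeric version is added here so that it can be called on short slices; the original line indexes win directly.

```go
package main

import (
	"bytes"
	"fmt"
)

// hasEscapePrefixNumeric mirrors the original comparison against ASCII codes,
// with a length guard added so that short windows do not panic.
func hasEscapePrefixNumeric(win []byte) bool {
	return len(win) >= 4 && win[0] == 92 && win[1] == 117 && win[2] == 48 && win[3] == 48
}

// hasEscapePrefix is the suggested form; bytes.HasPrefix handles short input itself.
func hasEscapePrefix(win []byte) bool {
	return bytes.HasPrefix(win, []byte("\\u00"))
}

func main() {
	for _, in := range []string{`\u001b`, `\u0100`, `\n`, `abc`, ``} {
		w := []byte(in)
		fmt.Printf("%q numeric=%v prefix=%v\n", in, hasEscapePrefixNumeric(w), hasEscapePrefix(w))
	}
}
```

Besides readability, the bytes.HasPrefix form removes the need to reason about the window length at the call site.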

mislav · 2023-02-01T13:34:01Z

api/sanitize_ascii.go

+}
+
+// Close closes the wrapped ReadCloser.
+func (s *sanitizeASCIIReadCloser) Close() error {


Nit: since your struct is embedding an io.ReadCloser interface, I don't think you need to explicitly declare this method. Your type should already automatically have a Close() method and forward it to the underlying ReadCloser.

I gave the ReadCloser wrapped by sanitizeASCIIReadCloser a name in the struct, so Go won't promote the Close method onto sanitizeASCIIReadCloser. Having said that, I think it makes sense to make the ReadCloser actually embedded.

samcoe requested a review from a team as a code owner January 25, 2023 17:51

samcoe self-assigned this Jan 25, 2023

samcoe requested review from mislav and removed request for a team January 25, 2023 17:51

cliAutomation added this to Needs review 🤔 in The GitHub CLI Jan 25, 2023

samcoe requested a review from vilmibm January 25, 2023 18:05

mislav reviewed Jan 26, 2023

Sanitize ANSII control characters returned from the server

fe95ce6

samcoe force-pushed the ansi-escaping branch from bb0c4ba to 67c327d Compare January 30, 2023 04:25

samcoe requested a review from mislav January 30, 2023 04:25

samcoe force-pushed the ansi-escaping branch from 67c327d to 93f532a Compare January 30, 2023 05:11

New approach

1035bdf

samcoe force-pushed the ansi-escaping branch from 93f532a to 1035bdf Compare January 30, 2023 17:12

mislav reviewed Jan 30, 2023

Streaming and also move to new file for clarity

3f854c5

samcoe force-pushed the ansi-escaping branch from 6c4f95f to 3f854c5 Compare January 31, 2023 04:47

samcoe requested a review from mislav January 31, 2023 04:51

mislav approved these changes Feb 1, 2023

The GitHub CLI automation moved this from Needs review 🤔 to Needs to be merged 🎉 Feb 1, 2023

Clean up nits

2504c8b

samcoe enabled auto-merge (squash) February 1, 2023 21:08

samcoe merged commit ced071f into trunk Feb 1, 2023
10 checks passed

The GitHub CLI automation moved this from Needs to be merged 🎉 to Pending Release 🥚 Feb 1, 2023

samcoe deleted the ansi-escaping branch February 1, 2023 21:19
