
fix: decode response body as UTF-8 when no charset specified in DevToolsPlugin #1570

Merged
garrytrinder merged 3 commits into dotnet:main from waldekmastykarz:fix/unicode-encoding-devtools
Mar 2, 2026

Conversation

@waldekmastykarz
Collaborator

Summary

When the Content-Type header doesn't include a charset (e.g. application/json instead of application/json; charset=utf-8), the underlying proxy library defaults to ISO-8859-1 per the obsolete RFC 2616. As a result, any non-ASCII characters in a UTF-8 body appear garbled in the DevTools inspector.
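
To make the failure mode concrete, here is a minimal Python sketch of the same mismatch. It is an illustration only: the payload is made up rather than taken from the linked issue, and the plugin itself is C#.

```python
# Illustrative payload; the sample text is an assumption, not from the issue.
text = '{"greeting": "héllo"}'

# A server sends the body as UTF-8 bytes but omits the charset parameter.
raw = text.encode("utf-8")

# Decoding with the legacy ISO-8859-1 default turns every multi-byte
# UTF-8 sequence into several unrelated characters.
as_latin1 = raw.decode("iso-8859-1")

# Decoding as UTF-8 recovers the original text.
as_utf8 = raw.decode("utf-8")

print(as_latin1)
print(as_utf8)
```

The two-byte UTF-8 sequence for "é" is read as two separate ISO-8859-1 characters, which is the kind of garbling described above.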

Fix

Default to UTF-8 when no charset is specified in the Content-Type header, aligning with modern standards:

  • RFC 7231 (2014): Removed the ISO-8859-1 default from HTTP/1.1
  • RFC 8259 (2017): JSON MUST be encoded as UTF-8

When a charset is explicitly specified, we honor it.

Changes

  • Added GetBodyString helper that decodes raw response bytes using UTF-8 when no charset is present, or uses the specified charset otherwise
  • Applied to both request body (PostData) and response body in DevToolsPlugin
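
The decoding rule can be sketched in a few lines of Python. This is a hypothetical stand-in for the C# GetBodyString helper, not its actual code: the function name, signature, and the assumption that the charset has already been extracted from the header are all illustrative.

```python
from typing import Optional


def get_body_string(raw: bytes, charset: Optional[str] = None) -> str:
    """Decode raw body bytes (hypothetical Python analogue of the helper).

    An explicitly declared charset is honored; when none is declared,
    the bytes are decoded as UTF-8 rather than ISO-8859-1.
    """
    return raw.decode(charset or "utf-8")


# No charset declared: the body is treated as UTF-8.
no_charset = get_body_string("naïve café".encode("utf-8"))

# Charset declared explicitly: it is used as-is.
legacy = get_body_string("café".encode("iso-8859-1"), "iso-8859-1")
```

Because the same function serves both directions, request bodies and response bodies end up decoded under one rule.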

Fixes #1566

@waldekmastykarz waldekmastykarz requested a review from a team as a code owner February 28, 2026 10:09
Copilot AI review requested due to automatic review settings February 28, 2026 10:09
@waldekmastykarz waldekmastykarz added the pr-bugfix Fixes a bug label Feb 28, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR updates DevToolsPlugin’s request/response body decoding so that when Content-Type omits a charset, bodies are decoded as UTF-8 instead of relying on the proxy library’s ISO-8859-1 default, fixing garbled Unicode in the DevTools inspector (issue #1566).

Changes:

  • Introduced a GetBodyString helper to decode raw body bytes as UTF-8 when no charset is specified.
  • Switched DevTools request PostData and text response body handling to use the new helper.

Contributor

@garrytrinder garrytrinder left a comment

Built and tested in detached mode. Clean startup, no errors. The GetBodyString method correctly parses the charset from Content-Type and falls back to UTF-8 for modern web content (per RFC 7231/8259) instead of the library's ISO-8859-1 default. Applied consistently in both request and response paths. Good comment explaining the rationale. LGTM.

@garrytrinder garrytrinder enabled auto-merge (squash) March 2, 2026 10:29
waldekmastykarz and others added 3 commits March 2, 2026 10:58
…olsPlugin

When the Content-Type header doesn't include a charset (e.g. 'application/json'
vs 'application/json; charset=utf-8'), the underlying proxy library defaults to
ISO-8859-1 per the obsolete RFC 2616. This causes Unicode characters to appear
garbled in the DevTools inspector.

Default to UTF-8 decoding when no charset is specified, which aligns with
modern standards (RFC 7231, RFC 8259).

Fixes dotnet#1566

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Use System.Net.Mime.ContentType to parse the charset from Content-Type
headers instead of manual string splitting. This handles quoted charsets
(e.g. charset="utf-8") and gracefully falls back to UTF-8 for
malformed headers or unsupported charsets.
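
As a rough Python analogue of this parsing step (the commit itself uses the C# System.Net.Mime.ContentType class), the sketch below leans on the standard library's header parser instead of splitting strings by hand. The function name charset_from_content_type is hypothetical, and using email.message.Message here is an assumption made for illustration.

```python
import codecs
from email.message import Message
from typing import Optional


def charset_from_content_type(content_type: Optional[str]) -> str:
    """Return the declared charset, or "utf-8" when it is missing,
    malformed, or not a codec this runtime knows about."""
    if not content_type:
        return "utf-8"

    # Let the stdlib header parser handle parameters and quoting,
    # e.g. charset="utf-8" with surrounding quotes.
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()  # lowercased, or None if absent
    if not charset:
        return "utf-8"

    try:
        codecs.lookup(charset)  # raises LookupError for unknown charsets
    except LookupError:
        return "utf-8"
    return charset


quoted = charset_from_content_type('application/json; charset="utf-8"')
missing = charset_from_content_type("application/json")
unknown = charset_from_content_type("text/plain; charset=not-a-real-charset")
```

Every failure path returns UTF-8, so a bad header degrades to the same behavior as a header with no charset at all.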

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@garrytrinder garrytrinder force-pushed the fix/unicode-encoding-devtools branch from 4b19292 to d28b372 Compare March 2, 2026 10:58
@garrytrinder garrytrinder merged commit ef5ec23 into dotnet:main Mar 2, 2026
4 checks passed

Labels

pr-bugfix Fixes a bug

Development

Successfully merging this pull request may close these issues.

[BUG]: Messy code when Inspect unicode character by DevToolsPlugin

3 participants