Skip to content

Fix emoji encoding in recorded .har files#1557

Merged
garrytrinder merged 5 commits intomainfrom
copilot/fix-emoji-encoding-har-file
Feb 26, 2026
Merged

Fix emoji encoding in recorded .har files#1557
garrytrinder merged 5 commits intomainfrom
copilot/fix-emoji-encoding-har-file

Conversation

Copy link
Contributor

Copilot AI commented Feb 25, 2026

Emoji and other non-ASCII characters in API responses are written as \uXXXX escape sequences in HAR files, making them unreadable.

Before: "text": "Hello \uD83D\uDE00\u2728 World"
After: "text": "Hello 😀✨ World"

Two issues at play:

  • BMP non-ASCII characters (U+0080–U+FFFF): System.Text.Json's default encoder escapes all non-ASCII. Fixed by setting Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping on ProxyUtils.JsonSerializerOptions. Safe here since JSON output goes to files, never embedded in HTML.

  • Non-BMP characters (U+10000+, most emoji): Even UnsafeRelaxedJsonEscaping still emits surrogate pair escapes (\uD83D\uDE00) for characters above U+FFFF. Fixed by post-processing the serialized HAR JSON with a regex that decodes surrogate pairs back to literal UTF-8.

Original prompt

This section details on the original issue you should resolve

<issue_title>[BUG]: Emoji in recorded .har file incorrectly represented</issue_title>
<issue_description>### Description

When you record a .har file, emojis in API responses are incorrectly encoded

Expected behaviour

Emoji correctly displayed

Actual behaviour

Emoji incorrectly encoded

Steps to reproduce

Record a .har file for an API that responds with emoji

Dev Proxy Version

latest

Operating system (environment)

Windows

Shell

PowerShell

Configuration file

n/a

Additional Info

No response</issue_description>

<agent_instructions>Research this issue. See if you can reproduce it. Be creative in looking for ways to repro.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@waldekmastykarz At first glance, Dev Proxy seems to be working fine. Needs additional research

💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

Add UnsafeRelaxedJsonEscaping to ProxyUtils.JsonSerializerOptions to
preserve BMP non-ASCII characters as literal UTF-8 in JSON output.

Add surrogate pair unescaping in HarGeneratorPlugin to convert
\uXXXX\uYYYY escape sequences for non-BMP characters (like emoji
U+1F600+) back to their literal UTF-8 representation.

Co-authored-by: waldekmastykarz <11164679+waldekmastykarz@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix emoji representation in recorded .har files Fix emoji encoding in recorded .har files Feb 25, 2026
@waldekmastykarz waldekmastykarz marked this pull request as ready for review February 25, 2026 13:20
@waldekmastykarz waldekmastykarz requested a review from a team as a code owner February 25, 2026 13:20
Copilot AI review requested due to automatic review settings February 25, 2026 13:20
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request fixes emoji and non-ASCII character encoding in recorded HAR files. Previously, emoji like 😀✨ were represented as unreadable escape sequences (\uD83D\uDE00\u2728) in HAR files. The fix addresses two issues:

  1. BMP non-ASCII characters (U+0080–U+FFFF): System.Text.Json's default encoder escapes all non-ASCII characters. Fixed by configuring JavaScriptEncoder.UnsafeRelaxedJsonEscaping in the global ProxyUtils.JsonSerializerOptions.

  2. Non-BMP characters (U+10000+, most emoji): Even with UnsafeRelaxedJsonEscaping, characters above U+FFFF are still emitted as surrogate pair escapes. Fixed by post-processing the serialized JSON with a regex that decodes UTF-16 surrogate pairs back to literal UTF-8 characters.

Changes:

  • Added JavaScriptEncoder.UnsafeRelaxedJsonEscaping to global JSON serialization options
  • Implemented surrogate pair decoding in HarGeneratorPlugin to convert \uD83D\uDE00 patterns to literal emoji
  • Added necessary imports for encoding and regex functionality

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
DevProxy.Abstractions/Utils/ProxyUtils.cs Added UnsafeRelaxedJsonEscaping encoder to global JsonSerializerOptions for proper Unicode handling
DevProxy.Plugins/Generation/HarGeneratorPlugin.cs Added regex-based surrogate pair decoder to convert UTF-16 escape sequences to literal emoji in HAR files

Copy link
Contributor

@garrytrinder garrytrinder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested: builds clean, BMP non-ASCII chars render correctly in HAR files (verified with accented characters), surrogate pair regex logic validated. UnsafeRelaxedJsonEscaping is appropriate for a CLI tool. LGTM.

@garrytrinder garrytrinder enabled auto-merge (squash) February 26, 2026 11:24
@garrytrinder garrytrinder merged commit 07b360a into main Feb 26, 2026
4 checks passed
@waldekmastykarz waldekmastykarz deleted the copilot/fix-emoji-encoding-har-file branch February 26, 2026 17:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG]: Emoji in recorded .har file incorrectly represented

4 participants