Skip to content

fix: handle UTF-16LE-encoded manifests on Windows#1021

Merged
tonyaiuto merged 1 commit intobazelbuild:mainfrom
rdesgroppes:fix-handling-of-utf16-manifests-on-windows
Feb 19, 2026
Merged

fix: handle UTF-16LE-encoded manifests on Windows#1021
tonyaiuto merged 1 commit intobazelbuild:mainfrom
rdesgroppes:fix-handling-of-utf16-manifests-on-windows

Conversation

@rdesgroppes
Copy link
Copy Markdown
Contributor

Prior to Bazel 8, manifest files containing non-ASCII characters were written with UTF-16LE encoding instead of UTF-8 on Windows:

This led to disable failing tests in CI:

//tests/zip:unicode_test:

File "pkg\private\manifest.py", line 59, in read_entries_from
  raw_entries = json.loads(fh.read())
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbb in position 338: invalid start byte

//tests/mappings:utf8_manifest_test:

File "tests\mappings\manifest_test_lib.py", line 39, in assertManifestsMatch
  got = json.loads(g_fp.read())
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbb in position 354: invalid start byte

Since the manifest is plain JSON, the fix simply consists in detecting whether the second byte is 0, where the default UTF-8 decoding would fail, in which case we assume the file is UTF-16LE-encoded.

The code is slightly reorganized to factor out the encoding selection.

This allows to enable //tests/mappings:utf8_manifest_test and //tests/zip:unicode_test tests in Windows CI.

@rdesgroppes rdesgroppes marked this pull request as ready for review February 8, 2026 16:44
@rdesgroppes rdesgroppes force-pushed the fix-handling-of-utf16-manifests-on-windows branch from 24b2ac0 to a79b1ee Compare February 10, 2026 07:30
Prior to Bazel 8, manifest files containing non-ASCII characters were
written with UTF-16LE encoding instead of UTF-8 on Windows:
- bazelbuild/bazel#24231
- bazelbuild/bazel#24350
- bazelbuild/bazel#24403

This led to disable failing tests in CI:

`//tests/zip:unicode_test`:
```
File "pkg\private\manifest.py", line 59, in read_entries_from
  raw_entries = json.loads(fh.read())
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbb in position 338: invalid start byte
```

`//tests/mappings:utf8_manifest_test`:
```
File "tests\mappings\manifest_test_lib.py", line 39, in assertManifestsMatch
  got = json.loads(g_fp.read())
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbb in position 354: invalid start byte
```

Since the manifest is plain JSON, the fix simply consists in detecting
whether the second byte is `0`, where the default UTF-8 decoding would
fail, in which case we assume the file is UTF-16LE-encoded.

The code is slightly reorganized to factor out the encoding selection.

This allows to enable `//tests/mappings:utf8_manifest_test` and
`//tests/zip:unicode_test` tests in Windows CI.
@rdesgroppes rdesgroppes force-pushed the fix-handling-of-utf16-manifests-on-windows branch from a79b1ee to 35bf0df Compare February 10, 2026 07:36
Copy link
Copy Markdown
Collaborator

@cgrindel cgrindel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @aiuto Do you want to take a look before we merge?

@tonyaiuto
Copy link
Copy Markdown
Collaborator

LGTM. I didn't have a chance to look until now.

@tonyaiuto tonyaiuto merged commit a7d9a54 into bazelbuild:main Feb 19, 2026
6 checks passed
rdesgroppes added a commit to DataDog/datadog-agent that referenced this pull request Feb 21, 2026
### What does this PR do?
Bump `rules_pkg` to Feb 21 `main, picking up:
- bazelbuild/rules_pkg#1021
- bazelbuild/rules_pkg#1024
- bazelbuild/rules_pkg#1025

### Motivation
The missing exit code propagation fixed by bazelbuild/rules_pkg#1024 is the main driver of this change because RPM build errors would go unnoticed.
@rdesgroppes rdesgroppes deleted the fix-handling-of-utf16-manifests-on-windows branch February 22, 2026 00:23
rdesgroppes added a commit to DataDog/datadog-agent that referenced this pull request Mar 20, 2026
### What does this PR do?
Bump `rules_pkg` to Feb 21 `main, picking up:
- bazelbuild/rules_pkg#1021
- bazelbuild/rules_pkg#1024
- bazelbuild/rules_pkg#1025

### Motivation
The missing exit code propagation fixed by bazelbuild/rules_pkg#1024 is the main driver of this change because RPM build errors would go unnoticed.
gh-worker-dd-mergequeue-cf854d bot pushed a commit to DataDog/datadog-agent that referenced this pull request Mar 20, 2026
### What does this PR do?
Bump `rules_pkg` to current `main`.

### Motivation
Pick up:
- bazelbuild/rules_pkg#1021: a contribution of ours
- bazelbuild/rules_pkg#1024: RPM build errors would go unnoticed without it
- bazelbuild/rules_pkg#1035: affects reproducibiliy of builds
- bazelbuild/rules_pkg#1044
- bazelbuild/rules_pkg#1046: another contribution of ours (@chouquette)
- bazelbuild/rules_pkg#1047: we might need this feature

Co-authored-by: regis.desgroppes <regis.desgroppes@datadoghq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants