darwin: download_and_extract fails on archive containing file with unicode name #7055
Description
Description of the problem / feature request:
The Starlark repository_ctx.download_and_extract method fails when asked to extract an archive that contains files with non-ASCII Unicode characters in their names. Specifically, the Go 1.12b1 SDK contains a test file named test/fixedbugs/issue27836.dir/Äfoo.go. When Bazel is asked to extract the SDK, it fails:
hello $ bazel fetch @go_sdk//...
INFO: Invocation ID: ef42f18b-4f19-4508-8e2c-268f4b2ac830
Loading: 0 packages loaded
ERROR: Traceback (most recent call last):
File "/private/var/tmp/_bazel_jayconrod/8b040ee1d8994791263025bc89f96607/external/io_bazel_rules_go/go/private/sdk.bzl", line 51
_remote_sdk(ctx, [url.format(filename) for url...], <2 more arguments>)
File "/private/var/tmp/_bazel_jayconrod/8b040ee1d8994791263025bc89f96607/external/io_bazel_rules_go/go/private/sdk.bzl", line 113, in _remote_sdk
ctx.download_and_extract(url = urls, stripPrefix = strip_pr..., ...)
java.io.FileNotFoundException: /private/var/tmp/_bazel_jayconrod/8b040ee1d8994791263025bc89f96607/external/go_sdk/test/fixedbugs/issue27836.dir/A?foo.go (No such file or directory)
Loading: 0 packages loaded
Loading: 0 packages loaded
This file is not part of the build. It's simply part of the archive we need to extract.
This seems to affect macOS. I'm on an APFS file system and haven't checked HFS+. On Linux, extraction succeeds, but the file ends up named ''$'\304''foo.go' (at least that's what ls prints). Windows works, but the Windows SDK is a .zip file, so it probably goes through a different code path.
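To make the two observed names easier to compare, here is a small Python sketch of how "Äfoo.go" looks under a few encodings. It only illustrates the byte patterns; it is not a confirmed diagnosis of what Bazel does internally, and the helper names are made up for this example.

```python
import unicodedata

# Precomposed "Ä" (U+00C4) followed by "foo.go".
NAME = "\u00c4foo.go"


def as_utf8(name):
    """UTF-8 bytes of the name: "Ä" becomes the two bytes 0xC3 0x84."""
    return name.encode("utf-8")


def as_latin1(name):
    """Latin-1 bytes of the name: "Ä" becomes the single byte 0xC4."""
    return name.encode("latin-1")


def decomposed_ascii(name):
    """Decompose to NFD ("A" + combining diaeresis), then force ASCII.

    Characters that ASCII cannot represent are replaced with "?".
    """
    nfd = unicodedata.normalize("NFD", name)
    return nfd.encode("ascii", errors="replace").decode("ascii")


if __name__ == "__main__":
    print(as_utf8(NAME))
    print(as_latin1(NAME))
    print(decomposed_ascii(NAME))
```

Octal 304 is 0xC4, so the name ls shows on Linux has the same leading byte as the Latin-1 encoding. The decomposed, ASCII-replaced form is the same string that appears in the FileNotFoundException on macOS. Whether either conversion actually happens inside download_and_extract is an assumption that would need checking against the Bazel source.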
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Create a WORKSPACE file like this:
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
http_archive(
    name = "io_bazel_rules_go",
    sha256 = "7be7dc01f1e0afdba6c8eb2b43d2fa01c743be1b9273ab1eaf6c233df078d705",
    urls = ["https://github.com/bazelbuild/rules_go/releases/download/0.16.5/rules_go-0.16.5.tar.gz"],
)
load("@io_bazel_rules_go//go:def.bzl", "go_download_sdk", "go_register_toolchains", "go_rules_dependencies")
go_download_sdk(
    name = "go_sdk",
    sdks = {
        "darwin_amd64": ("go1.12beta1.darwin-amd64.tar.gz", "e49bf83ae10b2232d2efa918f0e9df1d76f93a0c6b0ea18c11edd9ef9defa505"),
        "linux_amd64": ("go1.12beta1.linux-amd64.tar.gz", "65bfd4a99925f1f85d712f4c1109977aa24ee4c6e198162bf8e819fdde19e875"),
    },
)
go_rules_dependencies()
go_register_toolchains()
Run this command:
bazel fetch @go_sdk//...
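A smaller reproduction that avoids downloading the whole Go SDK might start from a tiny archive with a single non-ASCII file name. The Python sketch below builds one. The function names, the member path, and the choice of pax format are assumptions made for illustration, and I have not verified that an archive produced this way triggers the same failure.

```python
import io
import tarfile


def make_unicode_tar(path, member_name="test/\u00c4foo.go"):
    """Write a .tar.gz at `path` holding one small file named `member_name`.

    The pax format is used so the non-ASCII name is stored as UTF-8 in an
    extended header. The real Go SDK archive may use a different tar format.
    """
    data = b"package foo\n"
    info = tarfile.TarInfo(name=member_name)
    info.size = len(data)
    with tarfile.open(path, "w:gz", format=tarfile.PAX_FORMAT) as tar:
        tar.addfile(info, io.BytesIO(data))


def list_members(path):
    """Return the member names stored in the archive at `path`."""
    with tarfile.open(path, "r:gz") as tar:
        return tar.getnames()


if __name__ == "__main__":
    make_unicode_tar("unicode_name.tar.gz")
    print(list_members("unicode_name.tar.gz"))
```

The resulting file could then be served locally and referenced from an http_archive rule to see whether extraction fails in the same way.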
What operating system are you running Bazel on?
macOS 10.14.2
What's the output of bazel info release?
release 0.21.0
Have you found anything relevant by searching the web?
Relevant issues:
- Go 1.12b1 doesn't work due to files with utf8 chars bazel-contrib/rules_go#1880 - Original report
- Rename SymlinkEvent.to to SymlinkEvent.path and SymlinkEvent.from to SymlinkEvent.target. #6885 - closely related (honestly this is probably a duplicate), but wrapped in an extra layer
- Allow any characters in filenames / labels #374 - also related, but seems more general. Not about download_and_extract specifically.
Any other information, logs, or outputs that you want to share?
This will break rules_go when Go 1.12 ships in February. We'll add a workaround to avoid calling ctx.download_and_extract on macOS.