darwin: download_and_extract fails on archive containing file with unicode name #7055

@jayconrod

Description of the problem / feature request:

The Starlark repository_ctx.download_and_extract method fails when it's asked to extract an archive containing a file whose name has non-ASCII Unicode characters. Specifically, the Go 1.12beta1 SDK contains a test file named test/fixedbugs/issue27836.dir/Äfoo.go. When Bazel is asked to extract the SDK, it fails:

hello $ bazel fetch @go_sdk//...
INFO: Invocation ID: ef42f18b-4f19-4508-8e2c-268f4b2ac830
Loading: 0 packages loaded
ERROR: Traceback (most recent call last):
	File "/private/var/tmp/_bazel_jayconrod/8b040ee1d8994791263025bc89f96607/external/io_bazel_rules_go/go/private/sdk.bzl", line 51
		_remote_sdk(ctx, [url.format(filename) for url...], <2 more arguments>)
	File "/private/var/tmp/_bazel_jayconrod/8b040ee1d8994791263025bc89f96607/external/io_bazel_rules_go/go/private/sdk.bzl", line 113, in _remote_sdk
		ctx.download_and_extract(url = urls, stripPrefix = strip_pr..., ...)
java.io.FileNotFoundException: /private/var/tmp/_bazel_jayconrod/8b040ee1d8994791263025bc89f96607/external/go_sdk/test/fixedbugs/issue27836.dir/A?foo.go (No such file or directory)
Loading: 0 packages loaded
Loading: 0 packages loaded

This file is not part of the build. It's simply part of the archive we need to extract.

This seems to affect macOS. I'm on an APFS file system; I'm not sure about HFS+. On Linux, extraction succeeds, but the file is extracted as ''$'\304''foo.go' (at least that's what ls spits out). Windows works, but the Windows SDK is a .zip file, so it probably takes a different code path.
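The two spellings reported above (A?foo.go on macOS, and a single \304 byte on Linux) are both what you would get if the name were pushed through a Latin-1 encoding step at some point. That is only a guess; nothing in this report confirms where the mangling happens. The Python sketch below just demonstrates the byte-level arithmetic. The helper name latin1_bytes is made up for illustration.

```python
import unicodedata

# "Äfoo.go" written with a precomposed Ä (U+00C4), i.e. NFC form.
NAME = "\u00c4foo.go"


def latin1_bytes(name):
    """Encode `name` as Latin-1, substituting '?' for unencodable characters."""
    return name.encode("latin-1", errors="replace")


nfc = unicodedata.normalize("NFC", NAME)
nfd = unicodedata.normalize("NFD", NAME)  # "A" followed by combining U+0308

print(latin1_bytes(nfc))    # b'\xc4foo.go'  (0xC4 is octal 304)
print(latin1_bytes(nfd))    # b'A?foo.go'
print(nfc.encode("utf-8"))  # b'\xc3\x84foo.go'
```

If this guess is right, a lone 0xC4 byte is not valid UTF-8, which would fit with ls on Linux printing it as an escaped byte rather than as Ä.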

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Create a WORKSPACE file like this:

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "io_bazel_rules_go",
    sha256 = "7be7dc01f1e0afdba6c8eb2b43d2fa01c743be1b9273ab1eaf6c233df078d705",
    urls = ["https://github.com/bazelbuild/rules_go/releases/download/0.16.5/rules_go-0.16.5.tar.gz"],
)

load("@io_bazel_rules_go//go:def.bzl", "go_download_sdk", "go_register_toolchains", "go_rules_dependencies")

go_download_sdk(
    name = "go_sdk",
    sdks = {
        "darwin_amd64": ("go1.12beta1.darwin-amd64.tar.gz", "e49bf83ae10b2232d2efa918f0e9df1d76f93a0c6b0ea18c11edd9ef9defa505"),
        "linux_amd64": ("go1.12beta1.linux-amd64.tar.gz", "65bfd4a99925f1f85d712f4c1109977aa24ee4c6e198162bf8e819fdde19e875"),
    },
)

go_rules_dependencies()

go_register_toolchains()

Run this command:

bazel fetch @go_sdk//...
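A smaller reproduction may be possible without downloading the whole Go SDK. The Python sketch below builds a tiny .tar.gz that contains a single empty file named Äfoo.go. The function names, the GNU tar format, and the UTF-8 name encoding are all assumptions made for illustration; the real SDK archive may be laid out differently, and this has not been checked against Bazel.

```python
import io
import tarfile

# "Äfoo.go" with a precomposed Ä (U+00C4); escaped to avoid editor ambiguity.
UNICODE_NAME = "\u00c4foo.go"


def build_unicode_tar(name=UNICODE_NAME):
    """Return the bytes of a .tar.gz holding one empty file called `name`."""
    buf = io.BytesIO()
    with tarfile.open(
        fileobj=buf, mode="w:gz", format=tarfile.GNU_FORMAT, encoding="utf-8"
    ) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = 0
        tar.addfile(info, io.BytesIO(b""))
    return buf.getvalue()


def list_names(data):
    """Return the member names stored in a .tar.gz given as bytes."""
    with tarfile.open(
        fileobj=io.BytesIO(data), mode="r:gz", encoding="utf-8"
    ) as tar:
        return tar.getnames()


archive = build_unicode_tar()
print(list_names(archive))
```

Writing the returned bytes to a file and pointing a repository rule that calls download_and_extract at it should exercise the same extraction path, assuming the failure depends only on the entry name.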

What operating system are you running Bazel on?

macOS 10.14.2

What's the output of bazel info release?

release 0.21.0

Have you found anything relevant by searching the web?

Relevant issues:

Any other information, logs, or outputs that you want to share?

This will break rules_go when Go 1.12 ships in February. We'll add a workaround to avoid calling ctx.download_and_extract on macOS.

Labels

P2 (We'll consider working on this in future. Assignee optional); team-ExternalDeps (External dependency handling, remote repositories, WORKSPACE file); type: bug
