Skip to content

create_library_to_link: add mangle_name and unique_name arguments#8888

Closed
aherrmann wants to merge 11 commits intobazelbuild:masterfrom
aherrmann:create_library_to_link
Closed

create_library_to_link: add mangle_name and unique_name arguments#8888
aherrmann wants to merge 11 commits intobazelbuild:masterfrom
aherrmann:create_library_to_link

Conversation

@aherrmann
Copy link
Copy Markdown
Contributor

Closes #8180

  • Adds mangle_name and unique_name arguments to create_library_to_link
    • mangle_name allows users replicate the behavior of cc_library in their custom rules. The link placed under the solib directory will be mangled and no intermediate directory will be generated.
    • unique_name allows users to specify that the given library name is unique and no intermediate directory is required.
  • Adds the prefixLibraryPath argument to the internal CcCommon API to facilitate the above changes to create_library_to_link.
  • Adds shell tests for the mangle_name and unique_name arguments to create_library_to_link.

Motivation

See #8180 tweag/rules_haskell#958

Before this PR the new Starlark Cc API's create_library_to_link would always preserve the given library name and link it into the solib directory underneath an intermediate directory based on the library label and consumer label. These intermediate directories can grow very long, in particular if external workspaces are involved. In rules_haskell this causes issues on Windows and MacOS. On Windows the linker cannot look for libraries underneath search paths which exceed a certain path length. On MacOS the header size of a MACH-O binary/library is limited and the required runpaths/load commands would quickly exceed this limit due to the long paths generated by Bazel. In rules_haskell in particular these intermediate directories are unnecessary, as rules_haskell already mangles the library names itself to ensure their uniqueness. Furthermore, the Haskell compiler GHC expects library names of a particular shape, such that Bazel's builtin mangling cannot be used. This motivates the combination mangle_name = False, unique_name = True enabled by this PR.

cc @oquenchil @mboes @cocreature

@aherrmann aherrmann requested a review from hlopko as a code owner July 15, 2019 11:14
@aherrmann aherrmann force-pushed the create_library_to_link branch from 51cdeec to 6e3e4e2 Compare July 15, 2019 12:53
@aiuto aiuto added team-Rules-CPP Issues for C++ rules untriaged labels Jul 15, 2019
@aherrmann aherrmann force-pushed the create_library_to_link branch 2 times, most recently from 2fface6 to f85baf2 Compare July 15, 2019 16:36
@aherrmann
Copy link
Copy Markdown
Contributor Author

CI has passed and this is now ready for review.

The tests test the symlinks generated in the solib directory. Since these are not generated on Windows, I have disabled the tests on Windows.

@mboes
Copy link
Copy Markdown
Contributor

mboes commented Aug 10, 2019

This PR just exposes existing method parameters that are there in the Java API to Starlark code. It's currently a blocker for a few issues in rules_haskell.

@oquenchil
Copy link
Copy Markdown
Contributor

I will take a look at this on Monday next week. Even though this is in Java, it doesn't mean we want to expose it to Starlark. We are carefully reviewing every parameter we add to the API because unlike in Java fixing a mistake there is not as simple as an internal refactoring of our code.

@oquenchil
Copy link
Copy Markdown
Contributor

So if I understand correctly, you want more flexibility in the name used for the symlink created. I think better than the two booleans on the API, the change should add a String parameter which internally gets converted to a PathFragment. Then internally override the method getDynamicLibrarySymlink with a variation that doesn't take preserveName and prefixConsumer, but takes that PathFragment instead and where getMangledNamed() is not called.

That should give enough flexibility to do what you want. But first I have to ask, why was it causing issues to have the default preserveName=true and prefixConsumer=true. For a given instance, what was the name of the symlink produced and what was the name that you wanted and why did you want that different name?

Regarding the test, I would prefer if you remove the shell test altogether and add a new test here:
https://github.com/bazelbuild/bazel/blob/master/src/test/java/com/google/devtools/build/lib/rules/cpp/SkylarkCcCommonTest.java

Thanks for this PR!

@aherrmann
Copy link
Copy Markdown
Contributor Author

So if I understand correctly, you want more flexibility in the name used for the symlink created.

Yes, that is correct.

I think better than the two booleans on the API, the change should add a String parameter which internally gets converted to a PathFragment. Then internally override the method getDynamicLibrarySymlink with a variation that doesn't take preserveName and prefixConsumer, but takes that PathFragment instead and where getMangledNamed() is not called.

Interesting, that sounds feasible, I'll look into it.

But first I have to ask, why was it causing issues to have the default preserveName=true and prefixConsumer=true. For a given instance, what was the name of the symlink produced and what was the name that you wanted and why did you want that different name?

Sure, take tweag/rules_haskell@21c277b with Bazel 0.27.0 on Linux with the default GHC bindist. Then run bazel build //tests/binary-with-lib-dynamic.
This will produce

$ ls bazel-bin/_solib_k8/*
bazel-bin/_solib_k8/_U@rules_Uhaskell_S_Shaskell_Ctoolchain-libraries___Uexternal_Srules_Uhaskell_Ughc_Ulinux_Uamd64_Slib_Sbase-4.12.0.0:
libHSbase-4.12.0.0-ghc8.6.5.so

bazel-bin/_solib_k8/_U@rules_Uhaskell_S_Shaskell_Ctoolchain-libraries___Uexternal_Srules_Uhaskell_Ughc_Ulinux_Uamd64_Slib_Sghc-prim-0.5.3:
libHSghc-prim-0.5.3-ghc8.6.5.so

bazel-bin/_solib_k8/_U@rules_Uhaskell_S_Shaskell_Ctoolchain-libraries___Uexternal_Srules_Uhaskell_Ughc_Ulinux_Uamd64_Slib_Sinteger-gmp-1.0.2.0:
libHSinteger-gmp-1.0.2.0-ghc8.6.5.so

bazel-bin/_solib_k8/_U@rules_Uhaskell_S_Shaskell_Ctoolchain-libraries___Uexternal_Srules_Uhaskell_Ughc_Ulinux_Uamd64_Slib_Srts:
libffi.so  libHSrts-ghc8.6.5.so

bazel-bin/_solib_k8/_U_S_Stests_Sbinary-with-lib-dynamic_Clib___Utests_Sbinary-with-lib-dynamic:
libHStestsZSbinary-with-lib-dynamicZSlib-ghc8.6.5.so

By default Bazel produces very long directory names, and separate directories for most libraries. On macOS, the standard GHC bindist requires dynamic linking as part of compilation, each of these long paths will have to appear in the MACH-O header as a runpath or load command. This can very quickly add up to exceed the MACH-O header size limit. The header may not exceed 32k. With paths like the above we have frequently had issues with builds failing due to MACH-O header size at around 200 library dependencies. Currrently, rules_haskell employs a number of workarounds to avoid the issue. Shortening the symlink paths as with this PR would make our lives as rules_haskell maintainers easier, and also raise the upper bound for the number of possible library dependencies.

For core Haskell libraries as well as haskell_library targets, the directory prefix is unnecessary. The first few libraries are core libraries shipped with the compiler toolchain. Their paths look like this:

_U@rules_Uhaskell_S_Shaskell_Ctoolchain-libraries___Uexternal_Srules_Uhaskell_Ughc_Ulinux_Uamd64_Slib_Sbase-4.12.0.0/libHSbase-4.12.0.0-ghc8.6.5.so

and should instead look like this

libHSbase-4.12.0.0-ghc8.6.5.so

The library name contains the name and version of the corresponding core package and the compiler version. This makes accidental name collisions very, very unlikely.

The last library comes from a local haskell_library target. Its path looks like this:

_U_S_Stests_Sbinary-with-lib-dynamic_Clib___Utests_Sbinary-with-lib-dynamic/libHStestsZSbinary-with-lib-dynamicZSlib-ghc8.6.5.so

and should instead look like this

libHStestsZSbinary-with-lib-dynamicZSlib-ghc8.6.5.so

The library name already contains the full package path and is thereby unique. The libHS prefix is required by the Haskell compiler, same as the compiler version suffix.

Regarding the test, I would prefer if you remove the shell test altogether and add a new test here:
https://github.com/bazelbuild/bazel/blob/master/src/test/java/com/google/devtools/build/lib/rules/cpp/SkylarkCcCommonTest.java

Right, I'm not too familiar with the internal Java API. If I understand correctly, I'd generate the necessary files similar to how it's done in setUpCcLinkingContextTest and then test the symlinks similar to how it's done here. Correct?

Thanks for the review @oquenchil

aherrmann-da pushed a commit to digital-asset-archive/bazel that referenced this pull request Aug 22, 2019
Adds an overload of `getDynamicLibrarySymlink` that takes the relative
path of the symlink to be created underneath the solib directory.

Addresses review comment
bazelbuild#8888 (comment)
aherrmann-da pushed a commit to digital-asset-archive/bazel that referenced this pull request Aug 22, 2019
By default the Starlark function `create_library_to_link` would always
preserve the library name and create a subdirectory underneath
`solibDir` to place the library symlink.

This commit allows to override this behavior by specifying a path
fragment where to place the libray symlink underneath the solib dir.

Addresses review comment
bazelbuild#8888 (comment)
aherrmann-da pushed a commit to digital-asset-archive/bazel that referenced this pull request Aug 22, 2019
Add tests for custom solib link paths in create_library_to_link.

Addresses review comment
bazelbuild#8888 (comment)
@aherrmann aherrmann force-pushed the create_library_to_link branch from f85baf2 to 4de84d0 Compare August 22, 2019 12:45
@aherrmann aherrmann requested a review from scentini as a code owner August 22, 2019 12:45
@aherrmann
Copy link
Copy Markdown
Contributor Author

@oquenchil I've addressed your comments. create_library_to_link now takes optional arguments dynamic_library_path and interface_library_path. If set, these override the path of the symbolic link underneath the solib directory. I've also removed the shell integration tests and instead added Java tests in SkylarkCcCommonTest.java as requested. PTAL.

@oquenchil
Copy link
Copy Markdown
Contributor

Yes, that looks perfect. One little nit though. The dynamic_library_path and interface_library_path are changing the paths only for the symlinks correct? Ideally, these two names would make that clear, so something like dynamic_library_symlink_path and interface_library_symlink_path.

Otherwise, I think it might cause confusion with people thinking that the path of the actual dynamic_library and interface_library gets replaced.

Also initially I wanted to complain about symlinks being an internal optimization detail which is platform dependent and shouldn't be leaked. But this is already leaking through the resolved_symlink_dynamic_library and resolved_symlink_dynamic_library getters, so no issue here.

The PR looks perfect, the test is exactly how it should be and I appreciate you took good care of limiting the filetypes for these two parameters. Just renaming the two input parameters to make clear that it's for the symlink and we can merge this.

aherrmann-da pushed a commit to digital-asset-archive/bazel that referenced this pull request Aug 22, 2019
@aherrmann
Copy link
Copy Markdown
Contributor Author

@oquenchil Thank you, that's great to hear!

The dynamic_library_path and interface_library_path are changing the paths only for the symlinks correct? Ideally, these two names would make that clear, so something like dynamic_library_symlink_path and interface_library_symlink_path.

Yes, that is correct. Agreed, makes sense to be more explicit in those names. I've changed the names as suggested. PTAL

Copy link
Copy Markdown
Contributor

@oquenchil oquenchil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Param(
name = "dynamic_library_symlink_path",
doc =
"Override the default path of the dynamic library link in the solib directory.",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Path is an actual Starlark type -- lets make this clear this should be a string instead of a path type? maybe "path string" ?
(Same below)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've amended the docstring following #8888 (comment). That should clarify this a bit more. Also, the html documentation will display the type String above that description, similar to here.

named = true,
noneable = true,
type = String.class,
defaultValue = "None"),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a possibility empty-string is a meaningful path here? Otherwise, consider using empty string as the default value, and making the parameter type an actual String instead of an Object.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that an empty string is not meaningful here. I was trying to follow the convention of None meaning "use default behavior". But, it looks like for strings "" is indeed used for that purpose, e.g. in repository_ctx.execute(..., working_directory = ""), so this change seems alright. The advantage is that it makes it easier to forward a default value from a rule string attribute, like here.

aherrmann-da pushed a commit to digital-asset-archive/bazel that referenced this pull request Aug 28, 2019
Copy link
Copy Markdown
Contributor Author

@aherrmann aherrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@c-parsons I've addressed your comments. PTAL

@Param(
name = "dynamic_library_symlink_path",
doc =
"Override the default path of the dynamic library link in the solib directory.",
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've amended the docstring following #8888 (comment). That should clarify this a bit more. Also, the html documentation will display the type String above that description, similar to here.

named = true,
noneable = true,
type = String.class,
defaultValue = "None"),
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that an empty string is not meaningful here. I was trying to follow the convention of None meaning "use default behavior". But, it looks like for strings "" is indeed used for that purpose, e.g. in repository_ctx.execute(..., working_directory = ""), so this change seems alright. The advantage is that it makes it easier to forward a default value from a rule string attribute, like here.

@aherrmann
Copy link
Copy Markdown
Contributor Author

@c-parsons @oquenchil Can this be merged?

Copy link
Copy Markdown
Contributor

@c-parsons c-parsons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this review fell off my radar In general, this looks good to me. But I'd like to get approval from oquenchil@ before proceeding, as he has more knowledge of this piece of the API.

PathFragment dynamicLibraryPathFragment = null;
if (dynamicLibraryPath != null && !dynamicLibraryPath.isEmpty()) {
dynamicLibraryPathFragment = PathFragment.create(dynamicLibraryPath);
if (dynamicLibraryPathFragment.isEmpty() || dynamicLibraryPathFragment.isAbsolute() || dynamicLibraryPathFragment.containsUplevelReferences()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check dynamicLibraryPathFragment.isEmpty(), given that we already checked dynamicLibraryPath.isEmpty above?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. PathFragment.create performs normalization if necessary. The docs don't seem to guarantee that normalization never returns an empty PathFragment given a non-empty String. E.g. ./ or foo/.. could potentially be normalized away. For reference, the docs say

 * <p>Strings are normalized with '.' and '..' removed and resolved (if possible), any multiple
 * slashes ('/') removed, and any trailing slash also removed. Windows drive letters are uppercased.
 * The current implementation does not touch the incoming path string unless the string actually
 * needs to be normalized.

assertThat(
librariesToLink.toCollection(LibraryToLink.class).stream()
.filter(x -> x.getDynamicLibrary() != null)
.map(x -> x.getDynamicLibrary().getRootRelativePath().relativeTo(toolchain.getSolibDirectory()).toString())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can you refactor x.getDynamicLibrary().getRootRelativePath().relativeTo(toolchain.getSolibDirectory()).toString() to a helper private static method, to make this slightly more readable? (the combination of chained calls and use of stream() makes this difficult to read)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've factored it out as getSolibRelativePath.

aherrmann added a commit to aherrmann/bazel that referenced this pull request Sep 30, 2019
@oquenchil
Copy link
Copy Markdown
Contributor

Sorry this review fell off my radar In general, this looks good to me. But I'd like to get approval from oquenchil@ before proceeding, as he has more knowledge of this piece of the API.

Yes, still looks good to me.

@mjrussell
Copy link
Copy Markdown

Im running into MACH-O limits in rules_haskell and would greatly appreciate this getting merged. Is there anything holding this up from merging?

@aherrmann
Copy link
Copy Markdown
Contributor Author

@c-parsons I think this is ready for merge. Let me know if there's something else you'd like to see changed in this PR.

@mjrussell Thanks for pinging. Regarding MACH-O limits, which version of rules_haskell are you on? Version 0.11 contains some workarounds that should avoid most MACH-O header size issues. If you're still hitting these issues on 0.11, could you open an issue at https://github.com/tweag/rules_haskell with some more details?

Adds an overload of `getDynamicLibrarySymlink` that takes the relative
path of the symlink to be created underneath the solib directory.

Addresses review comment
bazelbuild#8888 (comment)
By default the Starlark function `create_library_to_link` would always
preserve the library name and create a subdirectory underneath
`solibDir` to place the library symlink.

This commit allows to override this behavior by specifying a path
fragment where to place the libray symlink underneath the solib dir.

Addresses review comment
bazelbuild#8888 (comment)
Add tests for custom solib link paths in create_library_to_link.

Addresses review comment
bazelbuild#8888 (comment)
@aherrmann aherrmann force-pushed the create_library_to_link branch from 85c0a8e to 43dc9a7 Compare November 21, 2019 14:07
@aherrmann
Copy link
Copy Markdown
Contributor Author

@c-parsons I've addressed your comments. PTAL.


Artifact notNullArtifactForIdentifier = null;
StringBuilder extensionErrorsBuilder = new StringBuilder();
String extensionErrorMessage = "does not have any of the allowed extensions";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dead code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still used in the validation logic for the static and dynamic library attributes. E.g. here. Sadly, it doesn't match so well with the symlink validation and refactoring that code too much seems out of scope for this PR.

}

private static void validateSymlinkPath(Location location, String attrName, PathFragment symlinkPath, FileTypeSet filetypes, StringBuilder errorsBuilder) throws EvalException {
String extensionErrorMessage = "does not have any of the allowed extensions";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please just inline this string

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Copy Markdown
Contributor

@c-parsons c-parsons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will merge shortly

@aherrmann
Copy link
Copy Markdown
Contributor Author

Will merge shortly

Awesome, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes team-Rules-CPP Issues for C++ rules

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow to override preserveName and prefixConsumer in create_library_to_link

9 participants