Cargo once by sporksmith · Pull Request #2404 · shadow/shadow

sporksmith · 2022-09-06T12:59:40Z

I was thinking some more about the coverage breakage - I think it actually highlights a bigger problem that our hybrid cargo/cmake build ends up building some rust code multiple times. It usually isn't a big deal and works out ok in the end, but could end up causing subtle problems.

In any given cargo invocation, it (re-)builds all dependencies in that invocation's target tree. The generated artifacts have mangled names that, IIUC, incorporate the version of the compiler, the version of the dependency, what features are enabled, and extra flags passed through env variables, etc.
For our Rust crates that link to C libraries, we pass extra flags via env variables for those to link correctly. When multiple crates with the same dependency are compiled, they end up generating different versions of that compiled dependency, since the flags were different. (Even though the flags being passed, -L, and -l shouldn't actually change the final static libraries)
Usually the linker manages to deduplicate correctly in the final link step, but it's not in the coverage build. (I have some ideas as to why, elided for relative brevity)
I think there could end up being other subtle issues e.g. if a multiply-compiled-and-linked dependency does end up in the final shadow executable or shadow-shim library multiple times, particularly if it uses any globals (such as thread-locals).

I think this problem all goes away if we build all of our Rust code in a single cargo invocation. In that case, everything is built with the same cargo env variables, getting rid of some of the false duplication.

For that to work, we need to move the knowledge of what linker flags to use for which crates from the cargo invocations in cmake, into cargo build scripts for each crate.

Right now that would be tricky, since cmake is responsible for building some of the C libraries. It's hard to "hard code" the right linker flags in the Rust build scripts.

We can fix that by wrapping each C library in a Rust crate. That moves the responsibility for managing and propagating linker flags from cmake to cargo.

So that's what I'm doing here. This PR currently:

Transforms each C library being linked into shadow or libshadow-shim.so into a Rust crate. Currently these are just ~empty Rust libraries, with a build script describing how to compile the C files. This also has the benefit of putting us in a position where we could incrementally start moving code from C to Rust within each of these libraries.
From cmake only makes a single cargo build invocation to build all of the relevant Rust code. In addition to avoiding false duplication due to different env variables, this allows the whole Rust build to use a single target directory, which should allow it do better caching/reuse.
I've also (partly) moved the bindgen invocations from cmake into the Rust build scripts. They are kept optional by putting them behind an optional "feature". This is semi-orthogonal, but makes a little more sense since we're already using cargo build scripts at this point, and moves further to driving the build of shadow itself from cargo instead of cmake.

I'm still doing some cleanup. In particular:

One downside is that the final link steps for shadow and the shim are still in cmake, and currently the knowledge of exactly which libraries need to be included in that link is hard-coded. I don't see an easy way to extract it from cargo. It's not that bad, but I'm going to try moving these to Rust as well, which should resolve it.
The cbindgen invocations are still in cmake. Might as well move those into the build script too if we're moving the bindgen invocations there.

If we do both of those, pretty much the whole shadow build will be managed by Cargo, and we'll just be using cmake to invoke cargo and compile and run the tests.

I started thinking more about whether we could do 1 big cargo build instead of using it to build i was curious about how hard it is to have Cargo compile C code for you; it turns out it's actually pretty nice. it ended up spiraling into moving most of the shadow build from cmake to cargo

sporksmith · 2022-09-09T20:47:52Z

Ok, I think this is ready. Sorry most of it is in one giant commit. I'd created a lot of smaller commits as I went, but ended up going back and redoing and fixing things a lot. There's a lot of interaction between different parts, s.t. using fixup commits to try to create a clean history of small commits would have been a fair bit of work. I ended up just breaking off a few small independent commits instead and putting the rest in one big one.

sporksmith · 2022-09-09T20:59:37Z

Some ad-hoc benchmarks on my machine to validate things didn't get worse:

Clean release build: 55s -> 54s
Nop incremental release build: 4s -> 2s
touch thread.c; incremental release build: 4s -> 21s

Mostly ok; incremental builds get a bit slower. I think this is because the C code is now only incremental at the crate level. When any file inside a crate is touched, we always rebuild all the C files in that crate and relink (though I think the compiled rust files within the affected crate should still be reused)

src/lib/shadow-build-common/src/lib.rs

src/main/build.rs

src/lib/shadow-build-common/src/lib.rs

src/main/build.rs

.github/workflows/lint.yml

src/lib/logger/src/bindings.rs

src/lib/shadow-shim-helper-rs/Cargo.toml

src/lib/tsc/tests/tsc_test.rs

src/main/build.rs

sporksmith · 2022-09-27T15:08:01Z

In addition to now always recompiling the C code, we unfortunately lose cmake's generation of compile_commands.json, which vscode and other tools use for analysis. I've verified that it can be generated by running the whole build under bear, but haven't worked out how to get vscode to use the resulting file yet.

I poked around some more to see if we could solve the main problem without this drastic overhaul. I thought maybe using cargo rustc, which allows passing flags only to the "top level" without propagating to dependencies, (instead of cargo build) to compile the test binaries, and only passing the link flags there, might solve the problem of dependencies getting rebuilt with different build hashes, but it didn't. Unfortunately what cargo incorporates into the build hash is intentionally opaque, so I wasn't able to dig in further.

Ultimately I still think this PR is the best way forward.

This will be used by the build.rs scripts to build C and C++ code, using consistent flags and settings. Some flags specified in the cmake files are explicitly specified here, but we currently also accept some flags via env variables so that we don't need to implement all of the conditional logic yet.

The previous state of invoking Cargo multiple times meant that dependencies could be built multiple times, potentially with different compiler flags. Our coverage build had broken as a result of a dependency getting compiled and linked multiple times with conflicting symbols (the link needed mangled symbols from both instances, but they had duplicate non-mangled symbols). Invoking Cargo only once avoids this issue - everything is compiled with the same Cargo environment (including e.g. CFLAGS, etc). Previously this wasn't possible since crates that need to link with shadow C libraries need the respective link flags passed through the Cargo environment, and there isn't a way to specify such flags for individual crates from the command-line. We solve this by wrapping the C libraries in Rust crates, and building those with Cargo as well. We use Rust build scripts to build the C and C++ code. This way Cargo can manage the dependencies and corresponding link flags since everything is a Rust crate. While not strictly necessary as part of this change, we also move binding generation into the build scripts instead of using the command line tools via cmake. As before this is disabled by default; it is now enabled via a Cargo feature *bindings*. This lets us move more logic out of cmake, partly deduplicate the bindgen and cbindgen configs by adding utility libraries (instead of having toml files for each project), and lets Cargo and our Cargo.lock file manage what version of these tools is used. While debugging, I added some additional binding-generation for some of the wrapped C crates, thinking it was needed to get the C static libraries to link properly. It turned out not to be needed for that purpose, but they'll be useful as we start to transition some of those libraries into Rust.

Generating the bindings via cmake ensures we use the same compiler flags as when compiling the rest of the code. In particular this works around an issue where newer versions of clang warn about deprecated atomic APIs unless we've specified the C standard version.

The new glib compilation test seems to have triggered a warning in cmake suggesting that we explicitly opt into either the old or new behavior

Accidentally dropped running any rust unit tests in shadow#2404 We were also previously not running doc tests, and I think not running tests for any crates for which we don't build a `staticlib` for C code to link to. (I'm not sure whether there were any such tests before, but there increasingly will be) For simplicity and maintainability I just invoke `cargo test` now, instead of trying to directly run the test executables. While the cargo behavior of always rebuilding (if needed) may be slightly surprising, this approach should be more robust and maintainable.

Accidentally dropped running any rust unit tests in #2404 We were also previously not running doc tests, and I think not running tests for any crates for which we don't build a `staticlib` for C code to link to. (I'm not sure whether there were any such tests before, but there increasingly will be) For simplicity and maintainability I just invoke `cargo test` now, instead of trying to directly run the test executables. While the cargo behavior of always rebuilding (if needed) may be slightly surprising, this approach should be more robust and maintainable.

These were inadvertently dropped in shadow#2404.

github-actions bot assigned sporksmith Sep 6, 2022

sporksmith added this to the Convert existing modules to Rust milestone Sep 6, 2022

sporksmith mentioned this pull request Sep 9, 2022

Remove --profile option from setup script #2405

Closed

sporksmith force-pushed the cargo-once branch from c62bbc6 to 82f1fc4 Compare September 9, 2022 20:44

sporksmith marked this pull request as ready for review September 9, 2022 20:45

sporksmith requested a review from stevenengler September 9, 2022 20:45

stevenengler approved these changes Sep 14, 2022

View reviewed changes

sporksmith force-pushed the cargo-once branch from 2120802 to bcc0a36 Compare September 27, 2022 14:49

sporksmith added 4 commits September 27, 2022 13:13

Set C++ standard globally

34cf1e9

Move profiling cflag to root

6fc538f

logger: remove dependency on glib

4aaf2d9

sporksmith force-pushed the cargo-once branch from 4d98c1c to 993cf28 Compare September 27, 2022 18:14

sporksmith added 4 commits September 27, 2022 13:27

cmake: opt into CMP0075

3fc8735

The new glib compilation test seems to have triggered a warning in cmake suggesting that we explicitly opt into either the old or new behavior

coverage: enable executable mappings in docker

fcfaf88

sporksmith force-pushed the cargo-once branch from 993cf28 to fcfaf88 Compare September 27, 2022 18:27

sporksmith enabled auto-merge September 27, 2022 18:30

sporksmith merged commit a8edfbf into shadow:main Sep 27, 2022

sporksmith deleted the cargo-once branch September 27, 2022 18:44

stevenengler mentioned this pull request Sep 27, 2022

Shadow documentation CI workflow is broken #2419

Closed

sporksmith mentioned this pull request Sep 28, 2022

Run rust unit tests (including doc tests) #2421

Merged

sporksmith added a commit to sporksmith/shadow that referenced this pull request Sep 28, 2022

Re-enable coverage flags in Rust build

7eae96c

These were inadvertently dropped in shadow#2404.

sporksmith added a commit to sporksmith/shadow that referenced this pull request Sep 29, 2022

Re-enable coverage flags in Rust build

15e124b

These were inadvertently dropped in shadow#2404.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cargo once#2404

Cargo once#2404
sporksmith merged 8 commits intoshadow:mainfrom
sporksmith:cargo-once

sporksmith commented Sep 6, 2022 •

edited

Loading

Uh oh!

sporksmith commented Sep 9, 2022

Uh oh!

sporksmith commented Sep 9, 2022

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sporksmith commented Sep 27, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

sporksmith commented Sep 6, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sporksmith commented Sep 9, 2022

Uh oh!

sporksmith commented Sep 9, 2022

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sporksmith commented Sep 27, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

sporksmith commented Sep 6, 2022 •

edited

Loading