Description
I’ve tried https://github.com/dtolnay/cargo-llvm-lines on our script crate. From their README:
Count the number of lines of LLVM IR across all instantiations of a generic function. Based on a suggestion from @eddyb on how to count monomorphized functions in order to debug compiler memory usage, executable size and compile time.
<eddyb> unoptimized LLVM IR
<eddyb> first used grep '^define' to get only the lines defining function bodies
<eddyb> then regex replace in my editor to remove everything before @ and everything after (
<eddyb> then sort | uniq -c
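The three steps quoted above can be sketched in Rust roughly as follows. This is only an illustration of the idea, not how cargo-llvm-lines is implemented; `defined_symbol` and `count_definitions` are hypothetical names. It groups symbol names verbatim, so it assumes names are already comparable. Mangled Rust symbols typically carry a per-instantiation hash, which a practical tool would have to strip or demangle first, and that step is omitted here.

```rust
use std::collections::HashMap;

/// Step 1 and 2: keep only `define` lines, then keep the text between
/// the first `@` and the following `(` (the symbol name).
fn defined_symbol(line: &str) -> Option<&str> {
    if !line.starts_with("define") {
        return None;
    }
    let start = line.find('@')? + 1;
    let rest = &line[start..];
    let end = rest.find('(')?;
    Some(&rest[..end])
}

/// Step 3: the equivalent of `sort | uniq -c`, except that rows are
/// ordered by descending count (ties broken by name) for readability.
fn count_definitions(ir: &str) -> Vec<(usize, String)> {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for name in ir.lines().filter_map(defined_symbol) {
        *counts.entry(name).or_insert(0) += 1;
    }
    let mut rows: Vec<(usize, String)> = counts
        .into_iter()
        .map(|(name, count)| (count, name.to_string()))
        .collect();
    rows.sort_by(|a, b| b.0.cmp(&a.0).then_with(|| a.1.cmp(&b.1)));
    rows
}

fn main() {
    // Made-up IR fragment, as if concatenated from several codegen units.
    let ir = "define void @foo(i32 %x) {\n  ret void\n}\n\
              declare void @baz()\n\
              define i32 @bar() {\n\
              define void @foo(i64 %y) {\n";
    for (count, name) in count_definitions(ir) {
        println!("{:>7} {}", count, name);
    }
}
```

Counting definitions only gives the "Copies" figure; the "Lines" figure additionally needs the length of each function body.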
This tool wants to run cargo rustc itself, which doesn’t play well with mach. To get it to run I needed to:
- Upgrade rustc, CC "thin LTO buffers without LLVM's NameAnonGlobals pass" dtolnay/cargo-llvm-lines#17
- Temporarily apply this diff:
--- components/script/Cargo.toml
+++ components/script/Cargo.toml
@@ -105,3 +105,3 @@
 smallvec = { version = "0.6", features = ["std", "union"] }
 sparkle = "0.1"
-style = { path = "../style", features = ["servo"] }
+style = { path = "../style", features = ["servo", "servo-layout-2013"] }
 style_traits = { path = "../style_traits" }
- cd into components/script (the tool does not support workspaces or a --package flag)
- Then run
cargo llvm-lines
This will likely compile from scratch, since the RUSTFLAGS configuration that mach sets is missing.
The output is very long, so here are only the functions that each contribute at least 1% of all lines of LLVM IR in libscript. The second column shows the number of times a function is instantiated from generic parameters. Both of these numbers likely influence compile times.
Lines Copies Function name
----- ------ -------------
10206761 (100%) 277692 (100%) (TOTAL)
762368 (7.5%) 5956 (2.1%) mozjs::panic::wrap_panic::{{closure}}
594917 (5.8%) 7334 (2.6%) std::thread::local::LocalKey<T>::try_with
572728 (5.6%) 5956 (2.1%) mozjs::panic::wrap_panic
332398 (3.3%) 5963 (2.1%) std::panicking::try
196779 (1.9%) 5963 (2.1%) std::panicking::try::do_catch
191026 (1.9%) 15098 (5.4%) core::ptr::drop_in_place
159092 (1.6%) 5963 (2.1%) std::panicking::try::do_call
148371 (1.5%) 6508 (2.3%) core::ops::function::FnOnce::call_once
127032 (1.2%) 6956 (2.5%) core::ptr::read
119928 (1.2%) 996 (0.4%) <<&mut bincode::de::Deserializer<R,O> as serde::de::Deserializer>::deserialize_tuple::Access<R,O> as serde::de::SeqAccess>::next_element_seed
104441 (1.0%) 6967 (2.5%) core::intrinsics::copy_nonoverlapping
102051 (1.0%) 7137 (2.6%) std::thread::local::LocalKey<T>::with
mozjs::panic alone accounts for 1.3 million lines, 13% of the crate. It’s instantiated 5956 times, most of them in code generated by DOM bindings codegen. Each instance calls std::panic::catch_unwind and methods of a thread_local!, which account for most instances of functions from std::panicking and std::thread::local. Adding those brings the number to about 2.6 million lines, or 25%!
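For reference, the arithmetic behind those figures can be reproduced from the table. Which rows were added up is not stated, so the selection below is an assumption: the two mozjs::panic rows, plus LocalKey<T>::try_with and the three std::panicking rows. LocalKey<T>::with is left out; including it would add another 102051 lines.

```rust
/// Total lines of LLVM IR reported for libscript (the TOTAL row).
const TOTAL: u64 = 10_206_761;

/// wrap_panic::{{closure}} + wrap_panic.
fn mozjs_panic_lines() -> u64 {
    762_368 + 572_728
}

/// Assumed selection: mozjs::panic plus LocalKey<T>::try_with,
/// std::panicking::try, try::do_catch and try::do_call.
fn with_std_helpers_lines() -> u64 {
    mozjs_panic_lines() + 594_917 + 332_398 + 196_779 + 159_092
}

/// Share of the crate's LLVM IR, in percent.
fn percent(lines: u64) -> f64 {
    100.0 * lines as f64 / TOTAL as f64
}

fn main() {
    let a = mozjs_panic_lines();
    let b = with_std_helpers_lines();
    println!("mozjs::panic:     {} lines ({:.1}%)", a, percent(a));
    println!("with std helpers: {} lines ({:.1}%)", b, percent(b));
}
```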
Now, mozjs::panic::wrap_panic takes an FnOnce parameter which is usually a closure that presumably gets inlined. So those million lines are split between:
- The inlined closures, different for each call site
- mozjs::panic itself, effectively duplicated
It’s hard to tell how the amount is distributed, but maybe it’s worth trying to make things less generic and less inlined to reduce that duplication, which would hopefully reduce compile times without impacting runtime perf significantly. Or maybe it’ll turn out to be mostly the different closures, which would mean that 25% of the generated code in libscript is inlined DOM API implementations, which is not particularly surprising or actionable.
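One way "less generic and less inlined" could look is sketched below. This is not the actual mozjs code: `wrap_panic_generic`, `wrap_panic_dyn`, `wrap_panic_thin`, `PENDING_PANIC` and `take_pending_panic` are hypothetical names, and the thread-local slot is only an assumption based on the description above. The idea is to move catch_unwind and the thread_local! access into a single non-generic function that takes `&mut dyn FnMut()`, so that only a thin shim is instantiated per call site.

```rust
use std::any::Any;
use std::cell::RefCell;
use std::panic::{self, AssertUnwindSafe};

thread_local! {
    // Hypothetical slot where a caught panic payload is parked.
    static PENDING_PANIC: RefCell<Option<Box<dyn Any + Send>>> = RefCell::new(None);
}

/// Fully generic shape: catch_unwind and the thread-local access are
/// instantiated again for every closure type `F`.
fn wrap_panic_generic<F: FnOnce() -> R, R>(f: F, fallback: R) -> R {
    match panic::catch_unwind(AssertUnwindSafe(f)) {
        Ok(value) => value,
        Err(payload) => {
            PENDING_PANIC.with(|slot| *slot.borrow_mut() = Some(payload));
            fallback
        }
    }
}

/// Non-generic core: compiled once, called through dynamic dispatch.
#[inline(never)]
fn wrap_panic_dyn(f: &mut dyn FnMut()) {
    if let Err(payload) = panic::catch_unwind(AssertUnwindSafe(|| f())) {
        PENDING_PANIC.with(|slot| *slot.borrow_mut() = Some(payload));
    }
}

/// Thin generic shim: only this small wrapper is duplicated per call site.
fn wrap_panic_thin<F: FnOnce() -> R, R>(f: F, fallback: R) -> R {
    let mut f = Some(f);
    let mut result = None;
    wrap_panic_dyn(&mut || {
        // The closure is FnMut for the trait object, but runs at most once.
        let f = f.take().expect("closure is called at most once");
        result = Some(f());
    });
    // `result` is still None if the closure panicked.
    result.unwrap_or(fallback)
}

/// Take the parked panic payload, if any.
fn take_pending_panic() -> Option<Box<dyn Any + Send>> {
    PENDING_PANIC.with(|slot| slot.borrow_mut().take())
}

fn main() {
    println!("generic: {}", wrap_panic_generic(|| 1 + 1, 0));
    println!("thin:    {}", wrap_panic_thin(|| 41 + 1, 0));
    let value = wrap_panic_thin(|| -> i32 { panic!("boom") }, -1);
    println!("after panic: {} (payload parked: {})", value, take_pending_panic().is_some());
}
```

Whether this helps depends on the open question above: it only removes the duplicated wrapper code, not the closure bodies, and the dynamic call adds an indirection at runtime. Re-running cargo llvm-lines before and after such a change would show how the lines are actually distributed.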