@DaanDeMeyer DaanDeMeyer commented May 5, 2025

If we look at some profiling data of our build (using ClangBuildAnalyzer), we can see that most of the time is spent parsing source code:

**** Time summary:
Compilation (1572 times):
  Parsing (frontend):          322.4 s
  Codegen & opts (backend):     51.1 s

One reason why we spend so much time parsing source code is that our headers transitively include many other headers, even if the included header is only tangentially relevant to the current header file. A good example is strv.h including hashmap.h because it has 4 functions that take a hashmap as input.

As a result, the compiler repeats the same work for every source file, parsing the same headers over and over again even if only a small number of the included declarations are actually used.

This PR removes transitive includes from various core headers and splits off a few new headers to try to reduce the number of declarations the compiler has to parse for each source file. The general idea is to replace macros and static inline functions with functions defined in the corresponding source file, which means all the includes used to implement those functions can be moved to the source file as well.
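To illustrate the pattern, here is a minimal single-file sketch. It is not code from this PR: `DemoMap`, `demo_map_contains()` and `demo_strv_count_in_map()` are hypothetical names, and the comments mark which parts would live in the header and which in the source file. The header side only needs a forward declaration of the map type, while the full definition and the includes used by the implementation stay on the source side.

```c
/* ---- demo-strv.h (hypothetical header) ---- */
#include <stddef.h>

/* Forward declaration: enough for prototypes that only pass a pointer,
 * so this header does not have to include the header defining DemoMap. */
typedef struct DemoMap DemoMap;

/* Imagine this used to be a static inline function in the header;
 * now it is only declared here. */
size_t demo_strv_count_in_map(char **l, const DemoMap *m);

/* ---- demo-strv.c (hypothetical source file) ---- */
/* Includes needed only by the implementation live here. */
#include <assert.h>
#include <string.h>

/* Stand-in for the full definition that '#include "demo-map.h"' would provide. */
struct DemoMap {
        const char **keys;
        size_t n_keys;
};

static int demo_map_contains(const DemoMap *m, const char *key) {
        for (size_t i = 0; i < m->n_keys; i++)
                if (strcmp(m->keys[i], key) == 0)
                        return 1;
        return 0;
}

size_t demo_strv_count_in_map(char **l, const DemoMap *m) {
        size_t n = 0;

        assert(m);

        /* l is a NULL-terminated string array; a NULL array counts as empty. */
        for (char **s = l; s && *s; s++)
                if (demo_map_contains(m, *s))
                        n++;

        return n;
}
```

Files that include only the header part never see `struct DemoMap` or `<string.h>`, so they parse less. The trade-off is that the function can no longer be inlined across translation units unless LTO is used.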

This gives us the following results after applying all these commits (frontend parsing time drops from 322.4 s to 236.1 s, roughly 27%):

**** Time summary:
Compilation (1575 times):
  Parsing (frontend):          236.1 s
  Codegen & opts (backend):     50.3 s

This work also makes incremental compilation more effective. When a header is modified, every file that (transitively) includes it is recompiled, so reducing the number of (transitively) included headers also reduces the number of files that need to be recompiled when a header changes.
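The effect on incremental builds can be modelled with a toy include graph. The sketch below is purely illustrative: the four-file project and the names `count_rebuilds()` and `includes_transitively()` are assumptions, not part of the build system. It counts how many source files transitively include a given header, which is the number of files that would be recompiled when that header changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical four-file project: indices below N_SOURCES are .c files. */
enum { A_C, B_C, STRV_H, HASHMAP_H, N_FILES };
enum { N_SOURCES = 2 };

/* Depth-first search: does `from` include `to`, directly or transitively?
 * g[i][j] is true when file i directly includes file j. */
static bool includes_transitively(bool g[N_FILES][N_FILES], int from, int to, bool seen[N_FILES]) {
        if (from == to)
                return true;
        if (seen[from])
                return false;
        seen[from] = true;

        for (int next = 0; next < N_FILES; next++)
                if (g[from][next] && includes_transitively(g, next, to, seen))
                        return true;

        return false;
}

/* Number of source files that must be recompiled when `header` changes. */
size_t count_rebuilds(bool g[N_FILES][N_FILES], int header) {
        size_t n = 0;

        assert(header >= 0 && header < N_FILES);

        for (int src = 0; src < N_SOURCES; src++) {
                bool seen[N_FILES] = { false };

                if (includes_transitively(g, src, header, seen))
                        n++;
        }

        return n;
}
```

In this toy graph, if `a.c` includes `strv.h`, `b.c` includes `hashmap.h`, and `strv.h` includes `hashmap.h`, a change to `hashmap.h` rebuilds both source files. Dropping the `strv.h` to `hashmap.h` edge leaves only `b.c` to rebuild.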

Note that there's more work to be done. Specifically, stdlib.h is still transitively included into every source file and probably doesn't need to be.

The current list of expensive headers with all these commits applied is as follows:

16983 ms: ../src/shared/tests.h (included 341 times, avg 49 ms), included via:
  341x: <direct include>

13636 ms: ../src/basic/memory-util.h (included 1424 times, avg 9 ms), included via:
  264x: fd-util.h 
  108x: alloca-util.h 
  75x: tests.h process-util.h alloca-util.h 
  56x: conf-parser.h 
  46x: bus-util.h pidref.h 
  40x: ansi-color.h terminal-util.h pidref.h 
  ...

13375 ms: ../src/basic/process-util.h (included 564 times, avg 23 ms), included via:
  295x: tests.h 
  268x: <direct include>
  1x: raw-clone.h 

12826 ms: ../src/systemd/sd-bus.h (included 507 times, avg 25 ms), included via:
  86x: <direct include>
  50x: bus-error.h 
  35x: bus-common-errors.h bus-error.h 
  30x: analyze.h analyze-verify-util.h execute.h bus-unit-util.h 
  28x: networkd-link.h 
  24x: networkd-address.h networkd-link.h 
  ...

11129 ms: ../src/basic/errno-util.h (included 1088 times, avg 10 ms), included via:
  226x: tests.h 
  220x: <direct include>
  76x: bus-error.h 
  65x: netlink-util.h socket-util.h 
  48x: bus-common-errors.h bus-error.h 
  47x: socket-util.h 
  ...

10765 ms: ../src/basic/string-util.h (included 1397 times, avg 7 ms), included via:
  198x: <direct include>
  96x: tests.h process-util.h 
  90x: process-util.h 
  59x: escape.h 
  56x: bus-util.h 
  54x: chase.h stat-util.h siphash24.h 
  ...

10152 ms: ../src/basic/hashmap.h (included 696 times, avg 14 ms), included via:
  87x: <direct include>
  42x: set.h 
  33x: analyze.h analyze-verify-util.h execute.h ordered-set.h 
  25x: resolved-dns-answer.h ordered-set.h 
  20x: udev-builtin.h udev-event.h 
  20x: networkd-link.h networkd-util.h 
  ...

10030 ms: ../src/systemd/sd-event.h (included 855 times, avg 11 ms), included via:
  86x: sd-bus.h 
  50x: sd-device.h 
  45x: bus-error.h sd-bus.h 
  32x: <direct include>
  31x: bus-common-errors.h bus-error.h sd-bus.h 
  30x: analyze.h analyze-verify-util.h execute.h bus-unit-util.h sd-bus.h 
  ...

9919 ms: ../src/shared/openssl-util.h (included 57 times, avg 174 ms), included via:
  23x: <direct include>
  12x: tpm2-util.h 
  6x: bootctl.h 
  4x: pull-common.h pull-job.h 
  3x: pe-binary.h 
  2x: cryptenroll-tpm2.h tpm2-util.h 
  ...

9665 ms: /usr/include/stdlib.h (included 1433 times, avg 6 ms), included via:
  219x: log.h 
  176x: <direct include>
  136x: errno-util.h 
  123x: tests.h errno-util.h 
  66x: bus-error.h errno-util.h 
  65x: conf-parser.h log.h 
  ...
