[build] Add build timing report and dependency analysis tools #25643

Merged
lihuay merged 2 commits into sonic-net:master from rustiqly:feat/build-timing-report
Mar 5, 2026

Conversation

@rustiqly
Contributor

What I did

Add three build instrumentation scripts and a make build-report target for analyzing SONiC build performance.

Scripts

scripts/build-timing-report.sh
Parses per-package timing from build logs (HEADER/FOOTER timestamps in target/*.log). Generates:

  • Top 30 slowest packages sorted by duration
  • Phase breakdown (bookworm/trixie/docker/wheels) with parallelism efficiency
  • Parallelism timeline (concurrent builds over time)
  • Summary stats (total CPU vs wall time, max concurrency)
  • CSV export for further analysis
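The summary stats listed above (total CPU vs wall time, max concurrency) can be derived from per-package start/end timestamps with a simple event sweep. The following is a minimal Python sketch of that idea, not the script's actual implementation (the script itself is shell). The `parallelism_stats` name and the `(package, start, end)` tuple layout are assumptions for illustration, and "CPU time" is assumed to mean the sum of per-package durations.

```python
from typing import Dict, Iterable, List, Tuple


def parallelism_stats(intervals: Iterable[Tuple[str, int, int]]) -> Dict[str, float]:
    """Summarize (package, start_epoch, end_epoch) build intervals.

    Hypothetical helper: the tuple layout is an assumption for this sketch,
    not the format produced by scripts/build-timing-report.sh.
    """
    spans: List[Tuple[int, int]] = [(start, end) for _, start, end in intervals]
    if not spans:
        return {"cpu_seconds": 0, "wall_seconds": 0,
                "max_concurrency": 0, "ratio_percent": 0.0}

    # Sum of per-package durations (assumed meaning of "CPU time" here).
    cpu = sum(end - start for start, end in spans)
    # Wall time runs from the earliest start to the latest end.
    wall = max(end for _, end in spans) - min(start for start, _ in spans)

    # Sweep line: +1 at each start, -1 at each end. Tuples sort so that an
    # end (-1) comes before a start (+1) at the same timestamp, which keeps
    # back-to-back builds from being counted as overlapping.
    events = sorted([(start, 1) for start, _ in spans] +
                    [(end, -1) for _, end in spans])
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)

    ratio = 100.0 * cpu / wall if wall > 0 else 0.0
    return {"cpu_seconds": cpu, "wall_seconds": wall,
            "max_concurrency": peak, "ratio_percent": ratio}


# Three one-minute builds, each overlapping the next by 30 seconds.
demo = [("a", 0, 60), ("b", 30, 90), ("c", 60, 120)]
print(parallelism_stats(demo))
```

For the demo input, at most two builds run at once and 180 seconds of package time fit into 120 seconds of wall time, so the ratio is 150%.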

scripts/build-dep-graph.py
Parses all rules/*.mk files to extract the dependency graph (_DEPENDS, _AFTER, _RDEPENDS, etc.). Generates:

  • Critical path analysis (longest dependency chain)
  • Fan-out/fan-in bottleneck identification
  • Root/leaf package counts (parallelism ceiling)
  • DOT graph for Graphviz visualization
  • JSON adjacency list
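To illustrate the critical path analysis mentioned above, here is a minimal Python sketch of a longest-chain search over a dependency dict, with an `in_progress` set guarding against cycles (the same idea the later review fixes in this thread describe). It is a sketch under assumptions, not the code in scripts/build-dep-graph.py: the `critical_path` name, the input shape, and the sample package graph are hypothetical.

```python
from typing import Dict, List, Set


def critical_path(deps: Dict[str, List[str]]) -> List[str]:
    """Return the longest dependency chain, deepest dependency first.

    `deps` maps a package to the packages it depends on. The result is
    exact for acyclic graphs; if a cycle exists, the back edge is ignored
    so the search terminates, and the result is only an approximation.
    """
    best: Dict[str, List[str]] = {}  # memoized longest chain ending at a package
    in_progress: Set[str] = set()    # packages on the current DFS stack

    def longest(pkg: str) -> List[str]:
        if pkg in best:
            return best[pkg]
        if pkg in in_progress:
            # Back edge: treat the cyclic dependency as absent.
            return []
        in_progress.add(pkg)
        chain: List[str] = []
        for dep in deps.get(pkg, []):
            candidate = longest(dep)
            if len(candidate) > len(chain):
                chain = candidate
        in_progress.discard(pkg)
        best[pkg] = chain + [pkg]
        return best[pkg]

    nodes = set(deps) | {d for ds in deps.values() for d in ds}
    return max((longest(n) for n in sorted(nodes)), key=len, default=[])


# Hypothetical miniature graph; the real one comes from rules/*.mk.
sample = {
    "UTILITIES": ["LIBSWSSCOMMON"],
    "SWSS": ["LIBSWSSCOMMON"],
    "LIBSWSSCOMMON": ["LIBNL"],
    "LIBNL": [],
}
print(" -> ".join(critical_path(sample)))
```

The length of the returned chain corresponds to the "packages deep" figure in the report. Weighting each package by its measured build time instead of counting nodes would be a natural extension, but that is not something this thread states the script does.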

scripts/build-resource-monitor.sh
Samples system resources during builds (CPU, memory, disk I/O, Docker containers) at configurable intervals. Outputs CSV for correlation with build timeline.
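For the CPU column, a utilization percentage has to come from the difference between two `/proc/stat` reads, since the counters there are cumulative since boot (the review fixes later in this thread switch to exactly this delta approach). Below is a minimal Python sketch of the calculation, assuming the first seven fields of the aggregate `cpu` line (user, nice, system, idle, iowait, irq, softirq) and counting idle + iowait as idle time. Function names are hypothetical and the actual script is shell.

```python
import time
from typing import List, Sequence


def parse_cpu_line(line: str) -> List[int]:
    """Parse the aggregate 'cpu ...' line of /proc/stat into jiffy counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def cpu_busy_percent(before: Sequence[int], after: Sequence[int]) -> float:
    """Busy percentage between two cumulative samples.

    Uses the first 7 fields: user nice system idle iowait irq softirq.
    Treating idle + iowait as idle is an assumption of this sketch.
    """
    deltas = [a - b for a, b in zip(after[:7], before[:7])]
    total = sum(deltas)
    if total <= 0:
        return 0.0  # no time elapsed, or counters went backwards
    idle = deltas[3] + (deltas[4] if len(deltas) > 4 else 0)
    return 100.0 * (total - idle) / total


def sample_cpu_percent(interval: float = 1.0) -> float:
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    def read() -> List[int]:
        with open("/proc/stat") as f:
            return parse_cpu_line(f.readline())

    first = read()
    time.sleep(interval)
    return cpu_busy_percent(first, read())
```

On a Linux host, `sample_cpu_percent(1.0)` would return the busy percentage over the last second; a single read divided by its own total would instead give the average since boot.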

Make target

make build-report runs timing + dependency analysis after a build completes.

Why I did it

Build performance optimization requires measurement. Currently there is no aggregated view of per-package timing or dependency bottlenecks. These tools provide the baseline data needed to identify and prioritize optimization opportunities.

Example findings from a VS build (24-core, 30GB RAM, JOBS=4):

  • 210 packages built in 53m wall (173m CPU time)
  • Max concurrency: 5 — significant room for improvement
  • Critical path: 14 packages deep (libnl → libswsscommon → utilities)
  • Top bottleneck: LIBSWSSCOMMON with 48 downstream dependents
  • Parallelism drops to 0-1 during libswsscommon → sairedis → swss serialization
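As a quick sanity check on these figures, the parallelism ratio (CPU time over wall time) can be worked out from the minute-rounded values. This is an approximation, since the report presumably works from second-level timestamps:

```latex
\[
\text{ratio} = \frac{T_{\text{CPU}}}{T_{\text{wall}}}
\approx \frac{173\ \text{min}}{53\ \text{min}}
\approx 3.26 \quad (\approx 326\%)
\]
```

That is close to the 322% shown in the verification output further down; the small gap is consistent with rounding of the minute figures.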

How I verified it

Ran all three scripts against a successful nightly VS build:

$ bash scripts/build-timing-report.sh ./target
Total packages with timing data: 210
Top slowest: linux-headers (16m), p4lang-p4c (14m), dhcp4relay (12m)
Parallelism ratio: 322% (CPU/wall)

$ python3 scripts/build-dep-graph.py .
Total packages: 284, Dependency edges: 520
Critical path length: 14 packages
LIBSWSSCOMMON fan-out: 48

$ bash scripts/build-resource-monitor.sh 10 /tmp/test.csv
# Successfully samples CPU/mem/disk every 10s

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Contributor

Copilot AI left a comment

Pull request overview

This pull request adds build performance analysis and instrumentation tools to sonic-buildimage. The PR introduces three new scripts and a Make target to help developers identify build bottlenecks and optimization opportunities.

Changes:

  • Added scripts/build-timing-report.sh to parse build logs and generate per-package timing analysis with parallelism statistics
  • Added scripts/build-dep-graph.py to parse dependency relationships from Makefile rules and generate critical path analysis
  • Added scripts/build-resource-monitor.sh to sample system resources during builds
  • Added make build-report target in slave.mk to run timing and dependency analysis

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 22 comments.

File Description
slave.mk Added build-report Makefile target and updated .PHONY declarations
scripts/build-timing-report.sh Shell script that parses build log timestamps to generate timing reports
scripts/build-resource-monitor.sh Shell script that samples CPU, memory, and disk I/O metrics during builds
scripts/build-dep-graph.py Python script that extracts and analyzes the build dependency graph from .mk files

@rustiqly rustiqly force-pushed the feat/build-timing-report branch 2 times, most recently from 6ad8bf2 to f4c184b, February 25, 2026 15:01
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Contributor

@yxieca yxieca left a comment

LGTM. AI agent on behalf of Ying.

yxieca
yxieca previously approved these changes Feb 28, 2026
Contributor

@yxieca yxieca left a comment

LGTM. AI agent on behalf of Ying.

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@rustiqly
Contributor Author

Copilot Review — All 22 Threads Addressed

Pushed commit 4733d4e with fixes for all valid issues and resolved all threads.

Code Fixes (17):

  1. free -g rounding → free -m with /1024 division for sub-GB precision
  2. Missing = regex → added plain = to Makefile dep patterns
  3. CPU % cumulative → delta between two /proc/stat reads 1s apart
  4. "Critical path estimate" comment → updated to "Build parallelism analysis"
  5. 60s vs 10s mismatch → fixed section comment to say "60-second samples"
  6. Fan stats missing after-only pkgs → compute_fan_stats now includes after relationships
  7. Missing ?= regex → added ? to all regex character classes
  8. Disk I/O div-by-zero → guard on INTERVAL > 0
  9. elapsed_line unused → removed
  10. Redundant LIBSWSSCOMMON_DBG → removed (caught by _DBG suffix)
  11. active_make_jobs in header → removed from comment
  12. _RDEPENDS not used → now populates rdeps dict
  13. if v filter on rdeps → removed unnecessary filter
  14. /proc/stat idle field → new delta code handles all 7 fields explicitly
  15. REPORT_FORMAT unused → removed parameter
  16. Critical path cycles → added in_progress set for cycle detection
  17. Execute permission check → added chmod +x loop for companion scripts
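Fixes 2, 6 and 7 above concern which Makefile assignment operators the dependency regex accepts and whether after relationships count toward fan stats. A minimal Python sketch of both ideas follows. It assumes rule lines of the form `$(PKG)_DEPENDS += $(OTHER) ...`, ignores backslash line continuations, and uses hypothetical names (`DEP_RE`, `parse_edges`, `fan_out`); it is not the code from scripts/build-dep-graph.py.

```python
import re
from collections import defaultdict
from typing import Dict, Set

# Assumed rule shape: "$(SWSS)_DEPENDS += $(LIBSWSSCOMMON) $(LIBSAIREDIS)".
# The operator group accepts =, :=, += and ?=.
DEP_RE = re.compile(
    r"^\s*\$\((?P<pkg>\w+)\)_(?P<kind>DEPENDS|RDEPENDS|AFTER)"
    r"\s*(?:[:+?]?=)\s*(?P<rhs>.*)$"
)
REF_RE = re.compile(r"\$\((\w+)\)")


def parse_edges(mk_text: str) -> Dict[str, Dict[str, Set[str]]]:
    """Collect DEPENDS / RDEPENDS / AFTER edges from Makefile text.

    Simplification: backslash line continuations are not joined.
    """
    edges: Dict[str, Dict[str, Set[str]]] = {
        "DEPENDS": defaultdict(set),
        "RDEPENDS": defaultdict(set),
        "AFTER": defaultdict(set),
    }
    for line in mk_text.splitlines():
        m = DEP_RE.match(line)
        if m:
            edges[m["kind"]][m["pkg"]].update(REF_RE.findall(m["rhs"]))
    return edges


def fan_out(edges: Dict[str, Dict[str, Set[str]]]) -> Dict[str, int]:
    """Count how many packages wait on each package via DEPENDS or AFTER."""
    dependents: Dict[str, Set[str]] = defaultdict(set)
    for kind in ("DEPENDS", "AFTER"):
        for pkg, targets in edges[kind].items():
            for target in targets:
                dependents[target].add(pkg)
    return {pkg: len(users) for pkg, users in dependents.items()}


# Hypothetical rules text exercising all four operators.
SAMPLE = """
$(SWSS)_DEPENDS += $(LIBSWSSCOMMON) $(LIBSAIREDIS)
$(LIBSAIREDIS)_DEPENDS = $(LIBSWSSCOMMON)
$(UTILITIES)_DEPENDS ?= $(LIBSWSSCOMMON)
$(DOCKER_X)_AFTER := $(SWSS)
"""
print(fan_out(parse_edges(SAMPLE)))
```

With a pattern that only matched `+=` or `:=`, the plain `=` and `?=` lines in the sample would be dropped and LIBSWSSCOMMON's fan-out would be undercounted, which is the kind of gap those fixes describe.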

Not Real Issues (5) — Resolved with explanation:

  1. date -d safety: build logs are trusted, not user input
  2. CSV commas: SONiC packages follow Debian naming, no commas
  3. Path traversal: build system controls target/ paths
  4. DOT red/white: standard graphviz convention, renders fine
  5. if v redundant: harmless defensive filter (removed anyway in #13)

@yxieca
Contributor

yxieca commented Mar 1, 2026

AI agent on behalf of Ying: CI is still running; will re-check and approve once green.

@yxieca
Contributor

yxieca commented Mar 1, 2026

AI agent on behalf of Ying: CI not clean yet (Azure.sonic-buildimage still in progress; optional kvmtest-t1-lag-vpp failed). Please update once green.

@yxieca
Contributor

yxieca commented Mar 1, 2026

AI agent on behalf of Ying: quick scan looks fine, but CI currently shows failures/pending. Please check the failing jobs and rerun; I’ll re-review once green.

@yxieca
Contributor

yxieca commented Mar 1, 2026

CI shows failures (Azure/impacted-area tests). Please re-run or confirm if infra flake. Once green, I can approve.

AI agent on behalf of Ying.

@yxieca
Contributor

yxieca commented Mar 1, 2026

AI agent on behalf of Ying: quick check shows outstanding issues.

  • CI failing: Azure.sonic-buildimage (Test kvmtest-t1-lag-vpp by Elastictest [OPTIONAL]).
    Please address/re-run; I’ll re-check.

Contributor

@yxieca yxieca left a comment

AI agent on behalf of Ying. Quick review: [build]. No issues found.

Contributor

@yxieca yxieca left a comment

AI agent on behalf of Ying. Quick review: [build] Add build timing report and dependency analysis tools. No issues found.

@rustiqly rustiqly force-pushed the feat/build-timing-report branch from 4733d4e to 6d1c672, March 2, 2026 15:01
Contributor

@yxieca yxieca left a comment

AI agent on behalf of Ying.

@rustiqly rustiqly force-pushed the feat/build-timing-report branch from 6d1c672 to 8111d36, March 3, 2026 15:02
@yxieca
Contributor

yxieca commented Mar 3, 2026

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

rustiqly added 2 commits March 4, 2026 07:04
[build] Add build timing report and dependency analysis tools

Add three scripts for build performance instrumentation:

- scripts/build-timing-report.sh: Parse per-package timing from build
  logs (HEADER/FOOTER timestamps), generate sorted duration table,
  phase breakdown, parallelism timeline, and CSV export.

- scripts/build-dep-graph.py: Parse rules/*.mk dependency graph,
  compute critical path, fan-out/fan-in bottleneck analysis, and
  generate DOT/JSON output for visualization.

- scripts/build-resource-monitor.sh: Sample CPU, memory, disk I/O,
  and Docker container count during builds for resource utilization
  analysis.

Add "make build-report" target to slave.mk that runs the timing
report and dependency analysis after a build completes.

Example output from a VS build on 24-core/30GB machine:
- 210 packages built in 53m wall time (173m CPU)
- Max concurrency: 5 (with SONIC_CONFIG_BUILD_JOBS=4)
- Critical path: 14 packages deep (libnl -> libswsscommon -> utilities)
- Top bottleneck: LIBSWSSCOMMON with 48 downstream dependents

Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>

Address Copilot review: fix 17 bugs in build analysis scripts

- Use free -m with division instead of free -g to avoid rounding (#1)
- Add = and ?= to Makefile dependency regex patterns (sonic-net#2, sonic-net#7)
- CPU calculation now uses /proc/stat delta (two reads) (sonic-net#3, sonic-net#14)
- Fix misleading 'critical path estimate' comment (sonic-net#4)
- Fix parallelism timeline comment (60s not 10s) (sonic-net#5)
- Include after-relationship packages in fan stats (sonic-net#6)
- Guard disk I/O division by zero when INTERVAL<=1 (sonic-net#8)
- Remove unused elapsed_line variable (sonic-net#9)
- Remove redundant LIBSWSSCOMMON_DBG check (sonic-net#10)
- Remove active_make_jobs from CSV header comment (sonic-net#11)
- Wire up _RDEPENDS parsing to build reverse deps (sonic-net#12)
- Remove unnecessary 'if v' filter on rdeps JSON (sonic-net#13)
- Remove unused REPORT_FORMAT parameter (sonic-net#15)
- Add cycle detection to critical path algorithm (sonic-net#16)
- Add execute permission check for companion scripts (sonic-net#17)

Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>
@rustiqly rustiqly force-pushed the feat/build-timing-report branch from 8111d36 to da18f0d, March 4, 2026 15:04
@lihuay
Contributor

lihuay commented Mar 4, 2026

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@lihuay lihuay merged commit 31bdc3f into sonic-net:master Mar 5, 2026
25 of 26 checks passed
FengPan-Frank pushed a commit to FengPan-Frank/sonic-buildimage that referenced this pull request Mar 6, 2026
…net#25643)

* [build] Add build timing report and dependency analysis tools

* Address Copilot review: fix 17 bugs in build analysis scripts

---------

Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>
Co-authored-by: Rustiqly <rustiqly@users.noreply.github.com>
Signed-off-by: Feng Pan <fenpan@microsoft.com>
5 participants