Skip to content

[nvidia-bluefield] enable component versions feature for bluefield dpu#1

Closed
tirupatihemanth wants to merge 1 commit intomasterfrom
htirupati/compvers
Closed

[nvidia-bluefield] enable component versions feature for bluefield dpu#1
tirupatihemanth wants to merge 1 commit intomasterfrom
htirupati/compvers

Conversation

@tirupatihemanth
Copy link
Copy Markdown
Owner

@tirupatihemanth tirupatihemanth commented Feb 28, 2025

Why I did it

To enable users to check the compiled versions of various sonic components and the current running versions on the bluefield dpu. Similar to component-version-listing PR

Work item tracking
  • Microsoft ADO (number only):

How I did it

  • I exported necessary environment variables for the compiled component versions from the corresponding Makefiles. I used a Makefile to write these into a new "component-versions" file.
  • I also converted existing get_component_versions.py script into a jinja2 template that works both for mellanox and bluefield-dpu platforms

How to verify it

  • cat /etc/mlnx/component-versions will give the compiled versions
  • get_component_versions.py will give a comparison between compiled versions and the currently running versions

Obtaining component-versions on a bluefield-dpu

root@sonic:/home/admin# get_component_versions.py
COMPONENT      COMPILATION       ACTUAL
-------------  ----------------  ----------------
SDK            25.1-RC8          1.5-1mlnx1
FW             43.1014           43.1014
SAI            SAIBuild0.0.39.0  SAIBuild0.0.39.0
HW_MANAGEMENT  N/A               N/A
MFT            4.28.0-96         4.28.0-96
KERNEL         6.1.0-22-2        6.1.0-22-2
BFSOC          4.10.0-13520      4.10.0-13520
ONIE           -                 N/A
SSD            -                 N/A
BIOS           -                 N/A
CPLD           -                 N/A
root@sonic:/home/admin# cat /etc/mlnx/component-versions
SDK 25.1-RC8
FW 43.1014
SAI SAIBuild0.0.39.0
BFSOC 4.10.0-13520
MFT 4.28.0-96
KERNEL 6.1.0-22-2

Obtaining component-versions on a switch

root@r-bobcat-03:/home/admin# cat /etc/mlnx/component-versions
SDK 4.7.2202
FW 2014.2202
SAI SAIBuild2411.2405.30.1
HW_MANAGEMENT 7.0040.2104
MFT 4.30.2-23
KERNEL 6.1.0-22-2
SIMX 25.1-1070
root@r-bobcat-03:/home/admin# get_component_versions.py
COMPONENT      COMPILATION             ACTUAL
-------------  ----------------------  ----------------------
SDK            4.7.2202                4.7.2202
FW             2014.2202               2014.2202
SAI            SAIBuild2411.2405.30.1  SAIBuild2411.2405.30.1
HW_MANAGEMENT  7.0040.2104             7.0040.2104
MFT            4.30.2-23               4.30.2-23
KERNEL         6.1.0-22-2              6.1.0-22-2
BFSOC          N/A                     N/A
ONIE           -                       2023.11-5.3.0012-9600
SSD            -                       CE00A400
BIOS           -                       0ACTV_00.00.019_9600
CPLD1          -                       CPLD000370_REV0104
CPLD2          -                       CPLD060387_REV0104
CPLD3          -                       CPLD000388_REV0100
DPU1_FPGA      -                       FPGA000375_REV0006
DPU2_FPGA      -                       FPGA000375_REV0006
DPU3_FPGA      -                       FPGA000375_REV0006
DPU4_FPGA      -                       FPGA000375_REV0006

Known Issues

  • The current version of sdk on the dpu is taken from the sdn-appliance field and is a known issue.
  • Fix Date: TBD

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

Comment thread platform/mellanox/get_component_versions/get_component_versions.j2 Outdated
@tirupatihemanth tirupatihemanth force-pushed the htirupati/compvers branch 4 times, most recently from 16b9b28 to fc0bc49 Compare March 20, 2025 22:42
tirupatihemanth pushed a commit that referenced this pull request May 2, 2025
…et#21405)

<!--
 Please make sure you've read and understood our contributing guidelines:
 https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md

 failure_prs.log skip_prs.log Make sure all your commits include a signature generated with `git commit -s` **

 If this is a bug fix, make sure your description includes "fixes #xxxx", or
 "closes #xxxx" or "resolves #xxxx"

 Please provide the following information:
-->

#### Why I did it

Adding the below fix from FRR FRRouting/frr#17297

This is to fix the following crash which is a statistical issue

```
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4 0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5 0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6 route_next (node=<optimized out>) at ../lib/table.c:436
#7 route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8 0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
 at ../zebra/interface.c:312
#9 0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
```

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Added patch.

#### How to verify it
Running BGP tests.

<!--
If PR needs to be backported, then the PR must be tested against the base branch and the earliest backport release branch and provide tested image version on these two branches. For example, if the PR is requested for master, 202211 and 202012, then the requester needs to provide test results on master and 202012.
-->

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
- [ ] 202211
- [ ] 202305

#### Tested branch (Please provide the tested image version)

<!--
- Please provide tested image version
- e.g.
- [x] 20201231.100
-->

- [ ] <!-- image version 1 -->
- [ ] <!-- image version 2 -->

#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->

<!--
 Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
-->

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->

#### A picture of a cute animal (not mandatory but encouraged)
tirupatihemanth pushed a commit that referenced this pull request Nov 7, 2025
…logs (sonic-net#56)

The `imklog` plugin of rsyslog collects the kernel logs from `/dev/kmsg` and
enqueues it to the syslog. With `CONFIG_PRINTK_TIME` the kernel messages are by
default prefixed with the elapsed time since boot. The `imklog` plugin parsing
these messages have a few options such as to keep the timestamps as such or to
interpret and adjust the syslog's reported time accordingly.

The rsylog release `8.2312.0` has fixes in interpreting these timestamps,
leading to the change in behavior observed in sonic-net#24386.

  https://salsa.debian.org/debian/rsyslog/-/blob/debian/8.2504.0-1/ChangeLog?ref_type=tags#L619

To restore the earlier behavior or retaining the kernel reported elapsed time,
disable `KlogParseKernelTimestamp` as this leads to removal of timestamp from
kernel messages and enable `KlogKeepKernelTimestamp` explicitly. The later is
required as the default is now to discard the kernel timestamp.

With this change, the logs retain the kernel timestamp:

    root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | head -n 3
    2025 Nov  4 05:15:14.918946 sonic NOTICE kernel: [    0.000000] Linux version 6.12.41+deb13-sonic-amd64 (debian-kernel@lists.debian.org) (x86_64-linux-gnu-gcc-14 (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44) #1 SMP PREEMPT_DYNAMIC Debian 6.12.41-1 (2025-08-12)
    2025 Nov  4 05:15:14.919533 sonic INFO kernel: [    0.000000] Command line: BOOT_IMAGE=/image-trixie.0-dirty-20251102.122837/boot/vmlinuz-6.12.41+deb13-sonic-amd64 root=UUID=ac0b6826-f8a3-461f-a8ff-701df60d90b6 rw console=tty0 console=ttyS0,115200n8 quiet processor.max_cstate=1 intel_idle.max_cstate=0 net.ifnames=0 biosdevname=0 loop=image-trixie.0-dirty-20251102.122837/fs.squashfs loopfstype=squashfs apparmor=1 security=apparmor varlog_size=4096 usbcore.autosuspend=-1 intel_iommu=off modprobe.blacklist=gpio_ich,i2c-ismt,i2c_ismt,i2c-i801,i2c_i801 crashkernel=0M-2G:256M,2G-4G:320M,4G-8G:384M,8G-:448M acpi_no_watchdog
    2025 Nov  4 05:15:14.919536 sonic INFO kernel: [    0.000000] BIOS-provided physical RAM map:
    root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | tail -n 3
    2025 Nov  4 05:17:26.831607 sonic WARNING kernel: [  143.527486] PDDF_LED       set_status_led: Set [FANTRAY_LED;1] color[green]
    2025 Nov  4 05:17:26.912442 sonic WARNING kernel: [  143.607086] PDDF_LED       set_status_led: Set [FANTRAY_LED;2] color[green]
    2025 Nov  4 05:20:32.499634 sonic WARNING kernel: [  329.195319] PDDF_LED       set_status_led: Set [SYS_LED;0] color[amber]
    root@sonic:~#

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>
Co-authored-by: Ramasamy Chandramouli <rachandr@celestica.com>
tirupatihemanth pushed a commit that referenced this pull request Nov 19, 2025
…logs (sonic-net#56)

The `imklog` plugin of rsyslog collects the kernel logs from `/dev/kmsg` and
enqueues it to the syslog. With `CONFIG_PRINTK_TIME` the kernel messages are by
default prefixed with the elapsed time since boot. The `imklog` plugin parsing
these messages have a few options such as to keep the timestamps as such or to
interpret and adjust the syslog's reported time accordingly.

The rsylog release `8.2312.0` has fixes in interpreting these timestamps,
leading to the change in behavior observed in sonic-net#24386.

  https://salsa.debian.org/debian/rsyslog/-/blob/debian/8.2504.0-1/ChangeLog?ref_type=tags#L619

To restore the earlier behavior or retaining the kernel reported elapsed time,
disable `KlogParseKernelTimestamp` as this leads to removal of timestamp from
kernel messages and enable `KlogKeepKernelTimestamp` explicitly. The later is
required as the default is now to discard the kernel timestamp.

With this change, the logs retain the kernel timestamp:

    root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | head -n 3
    2025 Nov  4 05:15:14.918946 sonic NOTICE kernel: [    0.000000] Linux version 6.12.41+deb13-sonic-amd64 (debian-kernel@lists.debian.org) (x86_64-linux-gnu-gcc-14 (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44) #1 SMP PREEMPT_DYNAMIC Debian 6.12.41-1 (2025-08-12)
    2025 Nov  4 05:15:14.919533 sonic INFO kernel: [    0.000000] Command line: BOOT_IMAGE=/image-trixie.0-dirty-20251102.122837/boot/vmlinuz-6.12.41+deb13-sonic-amd64 root=UUID=ac0b6826-f8a3-461f-a8ff-701df60d90b6 rw console=tty0 console=ttyS0,115200n8 quiet processor.max_cstate=1 intel_idle.max_cstate=0 net.ifnames=0 biosdevname=0 loop=image-trixie.0-dirty-20251102.122837/fs.squashfs loopfstype=squashfs apparmor=1 security=apparmor varlog_size=4096 usbcore.autosuspend=-1 intel_iommu=off modprobe.blacklist=gpio_ich,i2c-ismt,i2c_ismt,i2c-i801,i2c_i801 crashkernel=0M-2G:256M,2G-4G:320M,4G-8G:384M,8G-:448M acpi_no_watchdog
    2025 Nov  4 05:15:14.919536 sonic INFO kernel: [    0.000000] BIOS-provided physical RAM map:
    root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | tail -n 3
    2025 Nov  4 05:17:26.831607 sonic WARNING kernel: [  143.527486] PDDF_LED       set_status_led: Set [FANTRAY_LED;1] color[green]
    2025 Nov  4 05:17:26.912442 sonic WARNING kernel: [  143.607086] PDDF_LED       set_status_led: Set [FANTRAY_LED;2] color[green]
    2025 Nov  4 05:20:32.499634 sonic WARNING kernel: [  329.195319] PDDF_LED       set_status_led: Set [SYS_LED;0] color[amber]
    root@sonic:~#

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>
Co-authored-by: Ramasamy Chandramouli <rachandr@celestica.com>
tirupatihemanth pushed a commit that referenced this pull request Nov 25, 2025
…logs (sonic-net#56)

The `imklog` plugin of rsyslog collects the kernel logs from `/dev/kmsg` and
enqueues it to the syslog. With `CONFIG_PRINTK_TIME` the kernel messages are by
default prefixed with the elapsed time since boot. The `imklog` plugin parsing
these messages have a few options such as to keep the timestamps as such or to
interpret and adjust the syslog's reported time accordingly.

The rsylog release `8.2312.0` has fixes in interpreting these timestamps,
leading to the change in behavior observed in sonic-net#24386.

  https://salsa.debian.org/debian/rsyslog/-/blob/debian/8.2504.0-1/ChangeLog?ref_type=tags#L619

To restore the earlier behavior or retaining the kernel reported elapsed time,
disable `KlogParseKernelTimestamp` as this leads to removal of timestamp from
kernel messages and enable `KlogKeepKernelTimestamp` explicitly. The later is
required as the default is now to discard the kernel timestamp.

With this change, the logs retain the kernel timestamp:

    root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | head -n 3
    2025 Nov  4 05:15:14.918946 sonic NOTICE kernel: [    0.000000] Linux version 6.12.41+deb13-sonic-amd64 (debian-kernel@lists.debian.org) (x86_64-linux-gnu-gcc-14 (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44) #1 SMP PREEMPT_DYNAMIC Debian 6.12.41-1 (2025-08-12)
    2025 Nov  4 05:15:14.919533 sonic INFO kernel: [    0.000000] Command line: BOOT_IMAGE=/image-trixie.0-dirty-20251102.122837/boot/vmlinuz-6.12.41+deb13-sonic-amd64 root=UUID=ac0b6826-f8a3-461f-a8ff-701df60d90b6 rw console=tty0 console=ttyS0,115200n8 quiet processor.max_cstate=1 intel_idle.max_cstate=0 net.ifnames=0 biosdevname=0 loop=image-trixie.0-dirty-20251102.122837/fs.squashfs loopfstype=squashfs apparmor=1 security=apparmor varlog_size=4096 usbcore.autosuspend=-1 intel_iommu=off modprobe.blacklist=gpio_ich,i2c-ismt,i2c_ismt,i2c-i801,i2c_i801 crashkernel=0M-2G:256M,2G-4G:320M,4G-8G:384M,8G-:448M acpi_no_watchdog
    2025 Nov  4 05:15:14.919536 sonic INFO kernel: [    0.000000] BIOS-provided physical RAM map:
    root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | tail -n 3
    2025 Nov  4 05:17:26.831607 sonic WARNING kernel: [  143.527486] PDDF_LED       set_status_led: Set [FANTRAY_LED;1] color[green]
    2025 Nov  4 05:17:26.912442 sonic WARNING kernel: [  143.607086] PDDF_LED       set_status_led: Set [FANTRAY_LED;2] color[green]
    2025 Nov  4 05:20:32.499634 sonic WARNING kernel: [  329.195319] PDDF_LED       set_status_led: Set [SYS_LED;0] color[amber]
    root@sonic:~#

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>
Co-authored-by: Ramasamy Chandramouli <rachandr@celestica.com>
tirupatihemanth pushed a commit that referenced this pull request Dec 4, 2025
…logs (sonic-net#56)

The `imklog` plugin of rsyslog collects the kernel logs from `/dev/kmsg` and
enqueues it to the syslog. With `CONFIG_PRINTK_TIME` the kernel messages are by
default prefixed with the elapsed time since boot. The `imklog` plugin parsing
these messages have a few options such as to keep the timestamps as such or to
interpret and adjust the syslog's reported time accordingly.

The rsylog release `8.2312.0` has fixes in interpreting these timestamps,
leading to the change in behavior observed in sonic-net#24386.

  https://salsa.debian.org/debian/rsyslog/-/blob/debian/8.2504.0-1/ChangeLog?ref_type=tags#L619

To restore the earlier behavior or retaining the kernel reported elapsed time,
disable `KlogParseKernelTimestamp` as this leads to removal of timestamp from
kernel messages and enable `KlogKeepKernelTimestamp` explicitly. The later is
required as the default is now to discard the kernel timestamp.

With this change, the logs retain the kernel timestamp:

    root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | head -n 3
    2025 Nov  4 05:15:14.918946 sonic NOTICE kernel: [    0.000000] Linux version 6.12.41+deb13-sonic-amd64 (debian-kernel@lists.debian.org) (x86_64-linux-gnu-gcc-14 (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44) #1 SMP PREEMPT_DYNAMIC Debian 6.12.41-1 (2025-08-12)
    2025 Nov  4 05:15:14.919533 sonic INFO kernel: [    0.000000] Command line: BOOT_IMAGE=/image-trixie.0-dirty-20251102.122837/boot/vmlinuz-6.12.41+deb13-sonic-amd64 root=UUID=ac0b6826-f8a3-461f-a8ff-701df60d90b6 rw console=tty0 console=ttyS0,115200n8 quiet processor.max_cstate=1 intel_idle.max_cstate=0 net.ifnames=0 biosdevname=0 loop=image-trixie.0-dirty-20251102.122837/fs.squashfs loopfstype=squashfs apparmor=1 security=apparmor varlog_size=4096 usbcore.autosuspend=-1 intel_iommu=off modprobe.blacklist=gpio_ich,i2c-ismt,i2c_ismt,i2c-i801,i2c_i801 crashkernel=0M-2G:256M,2G-4G:320M,4G-8G:384M,8G-:448M acpi_no_watchdog
    2025 Nov  4 05:15:14.919536 sonic INFO kernel: [    0.000000] BIOS-provided physical RAM map:
    root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | tail -n 3
    2025 Nov  4 05:17:26.831607 sonic WARNING kernel: [  143.527486] PDDF_LED       set_status_led: Set [FANTRAY_LED;1] color[green]
    2025 Nov  4 05:17:26.912442 sonic WARNING kernel: [  143.607086] PDDF_LED       set_status_led: Set [FANTRAY_LED;2] color[green]
    2025 Nov  4 05:20:32.499634 sonic WARNING kernel: [  329.195319] PDDF_LED       set_status_led: Set [SYS_LED;0] color[amber]
    root@sonic:~#

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>
Co-authored-by: Ramasamy Chandramouli <rachandr@celestica.com>
tirupatihemanth pushed a commit that referenced this pull request Dec 5, 2025
#### Why I did it
If one python wheel is already installed inside slave container, it will not install again. Below is a sample log:
```
sed: -e expression #1, char 11: extra characters after command
WARNING: The directory '/var/user/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
Processing ./target/python-wheels/bookworm/sonic_yang_models-1.0-py3-none-any.whl
sonic-yang-models is already installed with the same version as the provided wheel. Use --force-reinstall to force an installation of the wheel.
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.

[notice] A new release of pip is available: 24.2 -> 25.3
[notice] To update, run: python3 -m pip install --upgrade pip
Build end time: Wed Dec 3 22:53:07 UTC 2025
Elapsed time: 0h 0m 1s
```
 However, we expect to reinstall the python wheel for target `$(PYTHON_WHEELS_PATH)/%-install`

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Update slave.mk to enasure force install the python wheel.

#### How to verify it
After this change, local build will successfully force install the python wheel. See new logs:
```
sed: -e expression #1, char 11: extra characters after command
WARNING: The directory '/var/qiluo/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
Processing ./target/python-wheels/bookworm/sonic_yang_models-1.0-py3-none-any.whl
Installing collected packages: sonic-yang-models
  Attempting uninstall: sonic-yang-models
    Found existing installation: sonic-yang-models 1.0
    Uninstalling sonic-yang-models-1.0:
      Successfully uninstalled sonic-yang-models-1.0
Successfully installed sonic-yang-models-1.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable.It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.

[notice] A new release of pip is available: 24.2 -> 25.3
[notice] To update, run: python3 -m pip install --upgrade pip
Build end time: Wed Dec 3 23:59:31 UTC 2025
```
tirupatihemanth pushed a commit that referenced this pull request Dec 30, 2025
…logs

The `imklog` plugin of rsyslog collects the kernel logs from `/dev/kmsg` and
enqueues it to the syslog. With `CONFIG_PRINTK_TIME` the kernel messages are by
default prefixed with the elapsed time since boot. The `imklog` plugin parsing
these messages have a few options such as to keep the timestamps as such or to
interpret and adjust the syslog's reported time accordingly.

The rsylog release `8.2312.0` has fixes in interpreting these timestamps,
leading to the change in behavior observed in sonic-net#24386.

  https://salsa.debian.org/debian/rsyslog/-/blob/debian/8.2504.0-1/ChangeLog?ref_type=tags#L619

To restore the earlier behavior or retaining the kernel reported elapsed time,
disable `KlogParseKernelTimestamp` as this leads to removal of timestamp from
kernel messages and enable `KlogKeepKernelTimestamp` explicitly. The later is
required as the default is now to discard the kernel timestamp.

With this change, the logs retain the kernel timestamp:

    root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | head -n 3
    2025 Nov  4 05:15:14.918946 sonic NOTICE kernel: [    0.000000] Linux version 6.12.41+deb13-sonic-amd64 (debian-kernel@lists.debian.org) (x86_64-linux-gnu-gcc-14 (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44) #1 SMP PREEMPT_DYNAMIC Debian 6.12.41-1 (2025-08-12)
    2025 Nov  4 05:15:14.919533 sonic INFO kernel: [    0.000000] Command line: BOOT_IMAGE=/image-trixie.0-dirty-20251102.122837/boot/vmlinuz-6.12.41+deb13-sonic-amd64 root=UUID=ac0b6826-f8a3-461f-a8ff-701df60d90b6 rw console=tty0 console=ttyS0,115200n8 quiet processor.max_cstate=1 intel_idle.max_cstate=0 net.ifnames=0 biosdevname=0 loop=image-trixie.0-dirty-20251102.122837/fs.squashfs loopfstype=squashfs apparmor=1 security=apparmor varlog_size=4096 usbcore.autosuspend=-1 intel_iommu=off modprobe.blacklist=gpio_ich,i2c-ismt,i2c_ismt,i2c-i801,i2c_i801 crashkernel=0M-2G:256M,2G-4G:320M,4G-8G:384M,8G-:448M acpi_no_watchdog
    2025 Nov  4 05:15:14.919536 sonic INFO kernel: [    0.000000] BIOS-provided physical RAM map:
    root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | tail -n 3
    2025 Nov  4 05:17:26.831607 sonic WARNING kernel: [  143.527486] PDDF_LED       set_status_led: Set [FANTRAY_LED;1] color[green]
    2025 Nov  4 05:17:26.912442 sonic WARNING kernel: [  143.607086] PDDF_LED       set_status_led: Set [FANTRAY_LED;2] color[green]
    2025 Nov  4 05:20:32.499634 sonic WARNING kernel: [  329.195319] PDDF_LED       set_status_led: Set [SYS_LED;0] color[amber]
    root@sonic:~#

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>
Co-authored-by: Ramasamy Chandramouli <rachandr@celestica.com>
tirupatihemanth pushed a commit that referenced this pull request Mar 13, 2026
…net#25643)

* [build] Add build timing report and dependency analysis tools

Add three scripts for build performance instrumentation:

- scripts/build-timing-report.sh: Parse per-package timing from build
  logs (HEADER/FOOTER timestamps), generate sorted duration table,
  phase breakdown, parallelism timeline, and CSV export.

- scripts/build-dep-graph.py: Parse rules/*.mk dependency graph,
  compute critical path, fan-out/fan-in bottleneck analysis, and
  generate DOT/JSON output for visualization.

- scripts/build-resource-monitor.sh: Sample CPU, memory, disk I/O,
  and Docker container count during builds for resource utilization
  analysis.

Add "make build-report" target to slave.mk that runs the timing
report and dependency analysis after a build completes.

Example output from a VS build on 24-core/30GB machine:
- 210 packages built in 53m wall time (173m CPU)
- Max concurrency: 5 (with SONIC_CONFIG_BUILD_JOBS=4)
- Critical path: 14 packages deep (libnl -> libswsscommon -> utilities)
- Top bottleneck: LIBSWSSCOMMON with 48 downstream dependents

Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>

* Address Copilot review: fix 17 bugs in build analysis scripts

- Use free -m with division instead of free -g to avoid rounding (#1)
- Add = and ?= to Makefile dependency regex patterns (#2, #7)
- CPU calculation now uses /proc/stat delta (two reads) (#3, #14)
- Fix misleading 'critical path estimate' comment (#4)
- Fix parallelism timeline comment (60s not 10s) (#5)
- Include after-relationship packages in fan stats (#6)
- Guard disk I/O division by zero when INTERVAL<=1 (#8)
- Remove unused elapsed_line variable (#9)
- Remove redundant LIBSWSSCOMMON_DBG check (#10)
- Remove active_make_jobs from CSV header comment (#11)
- Wire up _RDEPENDS parsing to build reverse deps (#12)
- Remove unnecessary 'if v' filter on rdeps JSON (#13)
- Remove unused REPORT_FORMAT parameter (#15)
- Add cycle detection to critical path algorithm (#16)
- Add execute permission check for companion scripts (#17)

Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>

---------

Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>
Co-authored-by: Rustiqly <rustiqly@users.noreply.github.com>
tirupatihemanth added a commit that referenced this pull request Apr 3, 2026
…dating udevd rules (sonic-net#26343)

- Why I did it
On SONiC SmartSwitch platforms with DPUs, systemd-udevd crashes with SIGABRT on every reboot when DPU firmware initialization is slow. During the initramfs boot phase, a standalone systemd-udevd daemon is started to handle device discovery. If DPU firmware takes longer than the 60-second udevadm settle timeout (BlueField-3 DPUs can take 120 seconds each in the failure case when they are stuck), the initramfs cannot stop this udevd before switch_root. The stale process survives into the real system but is never chrooted into the overlayfs root, leaving it with a broken filesystem view. When dpu-udev-manager.sh writes udev rules, the stale udevd detects the change and crashes on an assertion in systemd's chase() path resolution (assert(path_is_absolute(p)) at chase.c:648), because dir_fd_is_root() returns false for a process whose root still points to the initramfs rootfs rather than the overlayfs.

This triggers a systemd issue : systemd/systemd#29559 which maintainers doesn't consider as a bug from systemd side. Raising this fix for our usecase.

Core was generated by `/usr/lib/systemd/systemd-udevd --daemon --resolve-names=never'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f29fe7f695c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007f29fe7f695c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f29fe7a1cc2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f29fe78a4ac in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007f29fea50c11 in ?? () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#4  0x00007f29feb94a8b in chase () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#5  0x00007f29feb956e2 in chase_and_opendir () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#6  0x00007f29feb9a609 in conf_files_list_strv () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#7  0x00007f29fea913e8 in config_get_stats_by_path () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#8  0x0000559f295519cf in ?? ()
#9  0x0000559f29553a77 in ?? ()
#10 0x00007f29fec36055 in ?? () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#11 0x00007f29fec3668d in sd_event_dispatch () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#12 0x00007f29fec394a8 in sd_event_run () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#13 0x00007f29fec396c7 in sd_event_loop () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#14 0x0000559f29545820 in ?? ()
#15 0x00007f29fe78bca8 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#16 0x00007f29fe78bd65 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#17 0x0000559f29545c51 in ?? ()

- How I did it
Added a kill_stale_udevd() function to dpu-udev-manager.sh that runs before writing the udev rules. It identifies the systemd-managed udevd PID via systemctl show, then kills any other systemd-udevd --daemon process that doesn't match -- these are leftover initramfs instances. If no stale process exists (e.g. DPUs are healthy and the initramfs udevd exited cleanly), the function is a no-op.

- How to verify it
Deploy the image on a SmartSwitch with DPUs in a state where firmware initialization times out (>60s per DPU) by stopping image installation before firmware install step
Reboot the switch
Verify no new systemd-udevd coredumps in /var/core/
Verify the stale process was killed: journalctl -b 0 | grep dpu-udev-manager should show killing stale initramfs udevd PID (systemd udevd is PID )
Verify systemd-udevd.service is healthy: systemctl status systemd-udevd should show active (running)
Verify DPU udev rules were written: cat /etc/udev/rules.d/92-midplane-intf.rules should contain the DPU interface naming rules

Signed-off-by: Hemanth Kumar Tirupati <tirupatihemanthkumar@gmail.com>
tirupatihemanth pushed a commit that referenced this pull request Apr 6, 2026
…logs

The `imklog` plugin of rsyslog collects the kernel logs from `/dev/kmsg` and
enqueues it to the syslog. With `CONFIG_PRINTK_TIME` the kernel messages are by
default prefixed with the elapsed time since boot. The `imklog` plugin parsing
these messages have a few options such as to keep the timestamps as such or to
interpret and adjust the syslog's reported time accordingly.

The rsylog release `8.2312.0` has fixes in interpreting these timestamps,
leading to the change in behavior observed in sonic-net#24386.

  https://salsa.debian.org/debian/rsyslog/-/blob/debian/8.2504.0-1/ChangeLog?ref_type=tags#L619

To restore the earlier behavior or retaining the kernel reported elapsed time,
disable `KlogParseKernelTimestamp` as this leads to removal of timestamp from
kernel messages and enable `KlogKeepKernelTimestamp` explicitly. The later is
required as the default is now to discard the kernel timestamp.

With this change, the logs retain the kernel timestamp:

    root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | head -n 3
    2025 Nov  4 05:15:14.918946 sonic NOTICE kernel: [    0.000000] Linux version 6.12.41+deb13-sonic-amd64 (debian-kernel@lists.debian.org) (x86_64-linux-gnu-gcc-14 (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44) #1 SMP PREEMPT_DYNAMIC Debian 6.12.41-1 (2025-08-12)
    2025 Nov  4 05:15:14.919533 sonic INFO kernel: [    0.000000] Command line: BOOT_IMAGE=/image-trixie.0-dirty-20251102.122837/boot/vmlinuz-6.12.41+deb13-sonic-amd64 root=UUID=ac0b6826-f8a3-461f-a8ff-701df60d90b6 rw console=tty0 console=ttyS0,115200n8 quiet processor.max_cstate=1 intel_idle.max_cstate=0 net.ifnames=0 biosdevname=0 loop=image-trixie.0-dirty-20251102.122837/fs.squashfs loopfstype=squashfs apparmor=1 security=apparmor varlog_size=4096 usbcore.autosuspend=-1 intel_iommu=off modprobe.blacklist=gpio_ich,i2c-ismt,i2c_ismt,i2c-i801,i2c_i801 crashkernel=0M-2G:256M,2G-4G:320M,4G-8G:384M,8G-:448M acpi_no_watchdog
    2025 Nov  4 05:15:14.919536 sonic INFO kernel: [    0.000000] BIOS-provided physical RAM map:
    root@sonic:~# cat /var/log/syslog | grep "sonic.*kernel:" | tail -n 3
    2025 Nov  4 05:17:26.831607 sonic WARNING kernel: [  143.527486] PDDF_LED       set_status_led: Set [FANTRAY_LED;1] color[green]
    2025 Nov  4 05:17:26.912442 sonic WARNING kernel: [  143.607086] PDDF_LED       set_status_led: Set [FANTRAY_LED;2] color[green]
    2025 Nov  4 05:20:32.499634 sonic WARNING kernel: [  329.195319] PDDF_LED       set_status_led: Set [SYS_LED;0] color[amber]
    root@sonic:~#

Signed-off-by: Ramasamy Chandramouli <rachandr@celestica.com>
Co-authored-by: Ramasamy Chandramouli <rachandr@celestica.com>
tirupatihemanth pushed a commit that referenced this pull request Apr 6, 2026
#### Why I did it

Kdump takes a long time to complete because makedumpfile is version 1.7.6 which is not compatible with 6.12
([ref](https://github.com/makedumpfile/makedumpfile/releases/tag/1.7.6)). It incorrectly classifies free pages as in use by kernel, resulting in extremely large core dump file and taking a long time to generate. On 6.1, makedumpfile is 1.7.2.

Fixes sonic-net#25890

On 6.12., core dump is large with no `FREE` pages

```
$ uname -a
Linux gold404 6.12.41+deb13-sonic-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.12.41-1 (2025-08-12) x86_64 GNU/Linux

$ ls -l /var/crash/202602262349/
total 7412984
-rw------- 1 root root 304964 Feb 26 23:49 dmesg.202602262349
-rw-r--r-- 1 root root 7590581090 Feb 26 23:53 kdump.202602262349

$ dpkg -l makedumpfile
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-==============-============-============-=================================
ii makedumpfile 1:1.7.6-1 amd64 VMcore extraction tool

$ sudo makedumpfile --mem-usage /proc/kcore
The kernel version is not supported.
The makedumpfile operation may be incomplete.
TYPE		PAGES			EXCLUDABLE	DESCRIPTION
----------------------------------------------------------------------
ZERO		89343 	yes		Pages filled with zero
NON_PRI_CACHE	646424 	yes		Cache pages without private flag
PRI_CACHE	65827 	yes		Cache pages with private flag
USER		666096 	yes		User process pages
FREE		0 	yes		Free pages
KERN_DATA	6885323 	no		Dumpable kernel data
page size:		4096
Total pages on system:	8353013
Total size on system:	34213941248 Byte
```

On 6.1, core dump is much smaller

```
$ ls -l /var/crash/202602262359/
total 153752
-rw------- 1 root root 267051 Feb 26 23:59 dmesg.202602262359
-rw-r--r-- 1 root root 157169130 Feb 26 23:59 kdump.202602262359

$ dpkg -l makedumpfile
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-==============-============-============-=================================
ii makedumpfile 1:1.7.2-1 amd64 VMcore extraction tool

$ sudo makedumpfile --mem-usage /proc/kcore
The kernel version is not supported.
The makedumpfile operation may be incomplete.
TYPE		PAGES			EXCLUDABLE	DESCRIPTION
----------------------------------------------------------------------
ZERO		95856 	yes		Pages filled with zero
NON_PRI_CACHE	615186 	yes		Cache pages without private flag
PRI_CACHE	15085 	yes		Cache pages with private flag
USER		877145 	yes		User process pages
FREE		6354174 	yes		Free pages
KERN_DATA	266543 	no		Dumpable kernel data
page size:		4096
Total pages on system:	8223989
Total size on system:	33685458944 Byte
```

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it

Include makedumpfile 1.7.7
([ref](https://github.com/makedumpfile/makedumpfile/releases/tag/1.7.7)) in the build image, which has support for 6.12 kernel and fixes an issue specific to 6.12.

#### How to verify it

- Verify 1.7.7 makedumpfile is installed
- Verify FREE pages are classified correctly

```
$ dpkg -l makedumpfile
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-==============-============-============-=================================
ii makedumpfile 1:1.7.7-1+b1 amd64 VMcore extraction tool

$ sudo makedumpfile --mem-usage /proc/kcore

TYPE		PAGES			EXCLUDABLE	DESCRIPTION
----------------------------------------------------------------------
ZERO		66146 	yes		Pages filled with zero
NON_PRI_CACHE	586502 	yes		Cache pages without private flag
PRI_CACHE	49224 	yes		Cache pages with private flag
USER		455887 	yes		User process pages
FREE		6820384 	yes		Free pages
KERN_DATA	374870 	no		Dumpable kernel data

page size:		4096
Total pages on system:	8353013
Total size on system:	34213941248 Byte
```

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 202305
- [ ] 202311
- [ ] 202405
- [ ] 202411
- [ ] 202505
- [x] 202511

#### Tested branch (Please provide the tested image version)

<!--
- Please provide tested image version
- e.g.
- [x] 20201231.100
-->

- [ ] <!-- image version 1 -->
- [ ] <!-- image version 2 -->

#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->

<!--
 Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
-->

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->

Signed-off-by: Sonic Build Admin <sonicbld@microsoft.com>

#### A picture of a cute animal (not mandatory but encouraged)
tirupatihemanth pushed a commit that referenced this pull request Apr 6, 2026
…dating udevd rules (sonic-net#26573)

<!--
 Please make sure you've read and understood our contributing guidelines:
 https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md

 failure_prs.log skip_prs.log Make sure all your commits include a signature generated with `git commit -s` **

 If this is a bug fix, make sure your description includes "fixes #xxxx", or
 "closes #xxxx" or "resolves #xxxx"

 Please provide the following information:
-->

#### Why I did it
On SONiC SmartSwitch platforms with DPUs, systemd-udevd crashes with SIGABRT on every reboot when DPU firmware initialization is slow. During the initramfs boot phase, a standalone systemd-udevd daemon is started to handle device discovery. If DPU firmware takes longer than the 60-second udevadm settle timeout (BlueField-3 DPUs can take 120 seconds each in the failure case when they are stuck), the initramfs cannot stop this udevd before switch_root. The stale process survives into the real system but is never chrooted into the overlayfs root, leaving it with a broken filesystem view. When dpu-udev-manager.sh writes udev rules, the stale udevd detects the change and crashes on an assertion in systemd's chase() path resolution (assert(path_is_absolute(p)) at chase.c:648), because dir_fd_is_root() returns false for a process whose root still points to the initramfs rootfs rather than the overlayfs.

This triggers a systemd issue : systemd/systemd#29559 which maintainers doesn't consider as a bug from systemd side. Raising this fix for our usecase.

```
Core was generated by `/usr/lib/systemd/systemd-udevd --daemon --resolve-names=never'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f29fe7f695c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007f29fe7f695c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f29fe7a1cc2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007f29fe78a4ac in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007f29fea50c11 in ?? () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#4 0x00007f29feb94a8b in chase () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#5 0x00007f29feb956e2 in chase_and_opendir () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#6 0x00007f29feb9a609 in conf_files_list_strv () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#7 0x00007f29fea913e8 in config_get_stats_by_path () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#8 0x0000559f295519cf in ?? ()
#9 0x0000559f29553a77 in ?? ()
#10 0x00007f29fec36055 in ?? () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#11 0x00007f29fec3668d in sd_event_dispatch () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#12 0x00007f29fec394a8 in sd_event_run () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#13 0x00007f29fec396c7 in sd_event_loop () from /usr/lib/x86_64-linux-gnu/systemd/libsystemd-shared-257.so
#14 0x0000559f29545820 in ?? ()
#15 0x00007f29fe78bca8 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#16 0x00007f29fe78bd65 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#17 0x0000559f29545c51 in ?? ()

```

#### How I did it
Added a kill_stale_udevd() function to dpu-udev-manager.sh that runs before writing the udev rules. It identifies the systemd-managed udevd PID via systemctl show, then kills any other systemd-udevd --daemon process that doesn't match -- these are leftover initramfs instances. If no stale process exists (e.g. DPUs are healthy and the initramfs udevd exited cleanly), the function is a no-op.

#### How to verify it

<!--
If PR needs to be backported, then the PR must be tested against the base branch and the earliest backport release branch and provide tested image version on these two branches. For example, if the PR is requested for master, 202211 and 202012, then the requester needs to provide test results on master and 202012.
-->
- Deploy the image on a SmartSwitch with DPUs in a state where firmware initialization times out (>60s per DPU) by stopping image installation before firmware install step
- Reboot the switch
- Verify no new systemd-udevd coredumps in /var/core/
- Verify the stale process was killed: journalctl -b 0 | grep dpu-udev-manager should show killing stale initramfs udevd PID <X> (systemd udevd is PID <Y>)
- Verify systemd-udevd.service is healthy: systemctl status systemd-udevd should show active (running)
- Verify DPU udev rules were written: cat /etc/udev/rules.d/92-midplane-intf.rules should contain the DPU interface naming rules

Signed-off-by: Sonic Build Admin <sonicbld@microsoft.com>
tirupatihemanth pushed a commit that referenced this pull request Apr 8, 2026
…net#25643)

* [build] Add build timing report and dependency analysis tools

Add three scripts for build performance instrumentation:

- scripts/build-timing-report.sh: Parse per-package timing from build
  logs (HEADER/FOOTER timestamps), generate sorted duration table,
  phase breakdown, parallelism timeline, and CSV export.

- scripts/build-dep-graph.py: Parse rules/*.mk dependency graph,
  compute critical path, fan-out/fan-in bottleneck analysis, and
  generate DOT/JSON output for visualization.

- scripts/build-resource-monitor.sh: Sample CPU, memory, disk I/O,
  and Docker container count during builds for resource utilization
  analysis.

Add "make build-report" target to slave.mk that runs the timing
report and dependency analysis after a build completes.

Example output from a VS build on 24-core/30GB machine:
- 210 packages built in 53m wall time (173m CPU)
- Max concurrency: 5 (with SONIC_CONFIG_BUILD_JOBS=4)
- Critical path: 14 packages deep (libnl -> libswsscommon -> utilities)
- Top bottleneck: LIBSWSSCOMMON with 48 downstream dependents

Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>

* Address Copilot review: fix 17 bugs in build analysis scripts

- Use free -m with division instead of free -g to avoid rounding (#1)
- Add = and ?= to Makefile dependency regex patterns (#2, #7)
- CPU calculation now uses /proc/stat delta (two reads) (#3, #14)
- Fix misleading 'critical path estimate' comment (#4)
- Fix parallelism timeline comment (60s not 10s) (#5)
- Include after-relationship packages in fan stats (#6)
- Guard disk I/O division by zero when INTERVAL<=1 (#8)
- Remove unused elapsed_line variable (#9)
- Remove redundant LIBSWSSCOMMON_DBG check (#10)
- Remove active_make_jobs from CSV header comment (#11)
- Wire up _RDEPENDS parsing to build reverse deps (#12)
- Remove unnecessary 'if v' filter on rdeps JSON (#13)
- Remove unused REPORT_FORMAT parameter (#15)
- Add cycle detection to critical path algorithm (#16)
- Add execute permission check for companion scripts (#17)

Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>

---------

Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>
Co-authored-by: Rustiqly <rustiqly@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants