[202405] Restart PMON when SWSS flushes APPL_DB on Arista SKUs that use media_settings.json #1080

Merged
arlakshm merged 1 commit into Azure:202405 from arista-nwolfe:202405-restart-pmon-on-swss-flush
May 2, 2025

@arista-nwolfe
Contributor

Workaround for sonic-net/sonic-buildimage#21902

SWSS startup causes APPL_DB to be flushed.
This results in PORT_TABLE:Ethernet# losing its media_settings tunings populated by XCVRD.

Without this change (tuning lost):

sonic-db-cli -n asic0 APPL_DB hgetall "PORT_TABLE:Ethernet96"
{'alias': 'Ethernet13/1', 'asic_port_name': 'Eth96', 'core_id': '0', 'core_port_id': '13', 'index': '13', 'lanes': '40,41,42,43', 'num_voq': '8', 'role': 'Ext', 'speed': '100000', 'admin_status': 'up', 'description': 'ARISTA13T3:Ethernet1', 'fec': 'none', 'mtu': '9100', 'pfc_asym': 'off', 'tpid': '0x8100', 'oper_status': 'up', 'flap_count': '1', 'main': '0x4e,0x4e,0x53,0x4b', 'post1': '0xffffffea,0xffffffea,0xffffffea,0xffffffec', 'post2': '0x0,0x0,0x0,0x0', 'post3': '0x0,0x0,0x0,0x0', 'pre1': '0xfffffffb,0xfffffffb,0xfffffffb,0xfffffffb', 'pre2': '0x0,0x0,0x0,0x0', 'last_up_time': 'Thu Apr 03 16:32:03 2025'}

systemctl restart swss@0.service

sonic-db-cli -n asic0 APPL_DB hgetall "PORT_TABLE:Ethernet96"
{'alias': 'Ethernet13/1', 'asic_port_name': 'Eth96', 'core_id': '0', 'core_port_id': '13', 'index': '13', 'lanes': '40,41,42,43', 'num_voq': '8', 'role': 'Ext', 'speed': '100000', 'admin_status': 'up', 'description': 'ARISTA13T3:Ethernet1', 'fec': 'none', 'mtu': '9100', 'pfc_asym': 'off', 'tpid': '0x8100', 'oper_status': 'up', 'flap_count': '1', 'last_up_time': 'Thu Apr 03 17:41:08 2025'}

With this change (tuning preserved):

sonic-db-cli -n asic0 APPL_DB hgetall "PORT_TABLE:Ethernet96"
{'alias': 'Ethernet13/1', 'asic_port_name': 'Eth96', 'core_id': '0', 'core_port_id': '13', 'index': '13', 'lanes': '40,41,42,43', 'num_voq': '8', 'role': 'Ext', 'speed': '100000', 'admin_status': 'up', 'description': 'ARISTA13T3:Ethernet1', 'fec': 'none', 'mtu': '9100', 'pfc_asym': 'off', 'tpid': '0x8100', 'oper_status': 'up', 'flap_count': '1', 'main': '0x4e,0x4e,0x53,0x4b', 'post1': '0xffffffea,0xffffffea,0xffffffea,0xffffffec', 'post2': '0x0,0x0,0x0,0x0', 'post3': '0x0,0x0,0x0,0x0', 'pre1': '0xfffffffb,0xfffffffb,0xfffffffb,0xfffffffb', 'pre2': '0x0,0x0,0x0,0x0', 'last_up_time': 'Thu Apr 03 16:32:03 2025'}

systemctl restart swss@0.service

sonic-db-cli -n asic0 APPL_DB hgetall "PORT_TABLE:Ethernet96"
{'alias': 'Ethernet13/1', 'asic_port_name': 'Eth96', 'core_id': '0', 'core_port_id': '13', 'index': '13', 'lanes': '40,41,42,43', 'num_voq': '8', 'role': 'Ext', 'speed': '100000', 'admin_status': 'up', 'description': 'ARISTA13T3:Ethernet1', 'fec': 'none', 'mtu': '9100', 'pfc_asym': 'off', 'tpid': '0x8100', 'oper_status': 'up', 'flap_count': '1', 'main': '0x4e,0x4e,0x53,0x4b', 'post1': '0xffffffea,0xffffffea,0xffffffea,0xffffffec', 'post2': '0x0,0x0,0x0,0x0', 'post3': '0x0,0x0,0x0,0x0', 'pre1': '0xfffffffb,0xfffffffb,0xfffffffb,0xfffffffb', 'pre2': '0x0,0x0,0x0,0x0', 'last_up_time': 'Fri Apr 04 17:12:59 2025'}
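
For anyone repeating this check across many ports, the comparison above could be scripted. This is only a rough sketch, not part of the change: the helper names are made up for illustration, the field list is simply the set of tuning keys visible in the output above (other platforms may populate different keys), and it assumes the `hgetall` output is a Python-style dict literal as printed here.

```python
import ast

# Tuning keys taken from the PORT_TABLE output in this PR; this list is an
# assumption and may differ on other platforms or media_settings.json files.
TUNING_FIELDS = ("main", "post1", "post2", "post3", "pre1", "pre2")


def parse_hgetall(text):
    """Parse the dict-style text printed by `sonic-db-cli ... hgetall`."""
    return ast.literal_eval(text.strip())


def missing_tuning_fields(port_entry):
    """Return the tuning fields that are absent from a PORT_TABLE hash."""
    return [field for field in TUNING_FIELDS if field not in port_entry]
```

An empty list means the tunings survived the swss restart; a non-empty list names the fields that were lost in the flush.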

Syslog also shows pmon being restarted right after the DBs are flushed:

NOTICE root: Chassis db clean up for swss0. Number of SYSTEM_LAG_TABLE entries deleted: 8
NOTICE root: Restarting pmon service...
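
The same ordering could be checked automatically in a test run. A minimal sketch, assuming the two message substrings match what is logged above; the function name and marker constants are hypothetical and not part of this change.

```python
# Substrings copied from the syslog lines quoted in this PR.
FLUSH_MARKER = "Chassis db clean up for swss"
RESTART_MARKER = "Restarting pmon service"


def pmon_restarted_after_flush(lines):
    """Return True if a pmon restart message follows a DB clean-up message."""
    seen_flush = False
    for line in lines:
        if FLUSH_MARKER in line:
            seen_flush = True
        elif seen_flush and RESTART_MARKER in line:
            return True
    return False
```

Feeding it the relevant slice of syslog lines gives a quick pass/fail signal that the restart was triggered by the flush rather than happening before it.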

This change is specific to MSFT-202405 for Arista T2 SKUs. We'll submit a change in master that applies more generally to all SKUs that use media_settings.json.

@arista-nwolfe arista-nwolfe requested a review from lguohan as a code owner May 2, 2025 22:06
@arlakshm arlakshm merged commit a9a78ca into Azure:202405 May 2, 2025
3 checks passed
bingwang-ms pushed a commit that referenced this pull request Jan 16, 2026
…omatically (#24121)

#### Why I did it
src/sonic-swss-common
```
* 36a8519 - (HEAD -> master, origin/master, origin/HEAD) Add c-api/Rust wrappers for SonicV2Connector (#1080) (21 hours ago) [Qi Luo]
* f9cc568 - Fix trixie compilation and add Trixie to PR pipeline (#1069) (29 hours ago) [Saikrishna Arcot]
```
#### How I did it
#### How to verify it
#### Description for the changelog