Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[xcvrd] xcvrd crashes during breakout port #18893

Open
HaiHuang008 opened this issue May 7, 2024 · 6 comments · May be fixed by sonic-net/sonic-platform-daemons#500
Open

[xcvrd] xcvrd crashes during breakout port #18893

HaiHuang008 opened this issue May 7, 2024 · 6 comments · May be fixed by sonic-net/sonic-platform-daemons#500
Labels
Help Wanted 🆘 Triaged this issue has been triaged

Comments

@HaiHuang008
Copy link

Description

The ports of the device are all connected to OSFP cables.
The first time a port is breakout, xcvrd crashes out and then it initializes all ports again.

xcvrd crashes:

May  7 18:48:08.371367 sonic WARNING pmon#xcvrd: $$$ Ethernet0 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Eth1/1', 'index': '1', 'lanes': '9', 'speed': '100000'}
May  7 18:48:08.371500 sonic WARNING pmon#xcvrd: *** Ethernet0CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Eth1/1', 'index': '1', 'lanes': '9', 'speed': '100000', 'key': 'Ethernet0', 'asic_id': 0, 'op': 'SET'}
May  7 18:48:08.371322 sonic INFO ConfigMgmt: Writing in Config DB
May  7 18:48:08.371505 sonic INFO ConfigMgmt: Write in DB: {'PORT': {'Ethernet0': {'alias': 'Eth1/1', 'lanes': '9', 'speed': '100000', 'index': '1'}, 'Ethernet1': {'alias': 'Eth1/2', 'lanes': '10', 'speed': '100000', 'index': '1'}, 'Ethernet2': {'alias': 'Eth1/3', 'lanes': '11', 'speed': '100000', 'index': '1'}, 'Ethernet3': {'alias': 'Eth1/4', 'lanes': '12', 'speed': '100000', 'index': '1'}, 'Ethernet4': {'alias': 'Eth1/5', 'lanes': '13', 'speed': '100000', 'index': '1'}, 'Ethernet5': {'alias': 'Eth1/6', 'lanes': '14', 'speed': '100000', 'index': '1'}, 'Ethernet6': {'alias': 'Eth1/7', 'lanes': '15', 'speed': '100000', 'index': '1'}, 'Ethernet7': {'alias': 'Eth1/8', 'lanes': '16', 'speed': '100000', 'index': '1'}}}
May  7 18:48:08.395578 sonic WARNING pmon#xcvrd: $$$ Ethernet1 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Eth1/2', 'index': '1', 'lanes': '10', 'speed': '100000'}
May  7 18:48:08.395578 sonic WARNING pmon#xcvrd: $$$ Ethernet2 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Eth1/3', 'index': '1', 'lanes': '11', 'speed': '100000'}
May  7 18:48:08.395578 sonic WARNING pmon#xcvrd: $$$ Ethernet3 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Eth1/4', 'index': '1', 'lanes': '12', 'speed': '100000'}
May  7 18:48:08.395578 sonic WARNING pmon#xcvrd: $$$ Ethernet4 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Eth1/5', 'index': '1', 'lanes': '13', 'speed': '100000'}
May  7 18:48:08.395578 sonic WARNING pmon#xcvrd: $$$ Ethernet5 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Eth1/6', 'index': '1', 'lanes': '14', 'speed': '100000'}
May  7 18:48:08.395578 sonic WARNING pmon#xcvrd: $$$ Ethernet6 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Eth1/7', 'index': '1', 'lanes': '15', 'speed': '100000'}
May  7 18:48:08.395630 sonic WARNING pmon#xcvrd: $$$ Ethernet7 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Eth1/8', 'index': '1', 'lanes': '16', 'speed': '100000'}
May  7 18:48:08.395892 sonic WARNING pmon#xcvrd: *** Ethernet1CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Eth1/2', 'index': '1', 'lanes': '10', 'speed': '100000', 'key': 'Ethernet1', 'asic_id': 0, 'op': 'SET'}
May  7 18:48:08.395892 sonic WARNING pmon#xcvrd: *** Ethernet2CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Eth1/3', 'index': '1', 'lanes': '11', 'speed': '100000', 'key': 'Ethernet2', 'asic_id': 0, 'op': 'SET'}
May  7 18:48:08.395892 sonic WARNING pmon#xcvrd: *** Ethernet3CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Eth1/4', 'index': '1', 'lanes': '12', 'speed': '100000', 'key': 'Ethernet3', 'asic_id': 0, 'op': 'SET'}
May  7 18:48:08.395892 sonic WARNING pmon#xcvrd: *** Ethernet4CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Eth1/5', 'index': '1', 'lanes': '13', 'speed': '100000', 'key': 'Ethernet4', 'asic_id': 0, 'op': 'SET'}
May  7 18:48:08.395892 sonic WARNING pmon#xcvrd: *** Ethernet5CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Eth1/6', 'index': '1', 'lanes': '14', 'speed': '100000', 'key': 'Ethernet5', 'asic_id': 0, 'op': 'SET'}
May  7 18:48:08.395892 sonic WARNING pmon#xcvrd: *** Ethernet6CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Eth1/7', 'index': '1', 'lanes': '15', 'speed': '100000', 'key': 'Ethernet6', 'asic_id': 0, 'op': 'SET'}
May  7 18:48:08.395892 sonic WARNING pmon#xcvrd: *** Ethernet7CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Eth1/8', 'index': '1', 'lanes': '16', 'speed': '100000', 'key': 'Ethernet7', 'asic_id': 0, 'op': 'SET'}
May  7 18:48:08.403956 sonic ERR pmon#xcvrd: Exception occured at CmisManagerTask thread due to KeyError(None)
May  7 18:48:08.404851 sonic ERR pmon#xcvrd: Traceback (most recent call last):
May  7 18:48:08.404851 sonic ERR pmon#xcvrd:   File "/usr/local/lib/python3.9/dist-packages/xcvrd/xcvrd.py", line 1718, in run
May  7 18:48:08.404851 sonic ERR pmon#xcvrd:     self.task_worker()
May  7 18:48:08.404851 sonic ERR pmon#xcvrd:   File "/usr/local/lib/python3.9/dist-packages/xcvrd/xcvrd.py", line 1428, in task_worker
May  7 18:48:08.404888 sonic ERR pmon#xcvrd:     self.port_dict[lport]['host_tx_ready'] = self.get_host_tx_status(lport)
May  7 18:48:08.404888 sonic ERR pmon#xcvrd:   File "/usr/local/lib/python3.9/dist-packages/xcvrd/xcvrd.py", line 1300, in get_host_tx_status
May  7 18:48:08.405015 sonic ERR pmon#xcvrd:     state_port_tbl = self.xcvr_table_helper.get_state_port_tbl(asic_index)
May  7 18:48:08.405015 sonic ERR pmon#xcvrd:   File "/usr/local/lib/python3.9/dist-packages/xcvrd/xcvrd.py", line 2632, in get_state_port_tbl
May  7 18:48:08.405015 sonic ERR pmon#xcvrd:     return self.state_port_tbl[asic_id]
May  7 18:48:08.405015 sonic ERR pmon#xcvrd: KeyError: None
May  7 18:48:08.405015 sonic ERR pmon#xcvrd: Xcvrd: exception found at child thread CmisManagerTask due to KeyError(None)
May  7 18:48:08.405033 sonic ERR pmon#xcvrd: Exiting main loop as child thread raised exception!
May  7 18:48:08.414110 sonic INFO pmon#supervisord 2024-05-07 18:48:08,413 INFO exited: xcvrd (terminated by SIGKILL; not expected)

Due to xcvrd crashes, xcvrd is initialized again, and TRANSCEIVER INFO will be traversed and filled again.
As a result, the Media type display of the current port is very slow (about two minutes) in show interface status, and it will also affect other ports.

Steps to reproduce the issue:

  1. OSFP cables connected to all ports
  2. From the current port mode of 2x400GBreakout to 8x100G
    sudo config interface breakout Ethernet0 8x100G -f -y
  3. show interfaces status
admin@sonic:~$ show interfaces status
  Interface            Lanes    Speed    MTU    FEC    Alias    Vlan    Oper    Admin                           Type    Asym PFC
-----------  ---------------  -------  -----  -----  -------  ------  ------  -------  -----------------------------  ----------
  Ethernet0                9     100G   9100    N/A   Eth1/1  routed    down     down  OSFP 8X Pluggable Transceiver         N/A
  Ethernet1               10     100G   9100    N/A   Eth1/2  routed    down     down  OSFP 8X Pluggable Transceiver         N/A
  Ethernet2               11     100G   9100    N/A   Eth1/3  routed    down     down  OSFP 8X Pluggable Transceiver         N/A
  Ethernet3               12     100G   9100    N/A   Eth1/4  routed    down     down  OSFP 8X Pluggable Transceiver         N/A
  Ethernet4               13     100G   9100    N/A   Eth1/5  routed    down     down  OSFP 8X Pluggable Transceiver         N/A
  Ethernet5               14     100G   9100    N/A   Eth1/6  routed    down     down  OSFP 8X Pluggable Transceiver         N/A
  Ethernet6               15     100G   9100    N/A   Eth1/7  routed    down     down  OSFP 8X Pluggable Transceiver         N/A
  Ethernet7               16     100G   9100    N/A   Eth1/8  routed    down     down  OSFP 8X Pluggable Transceiver         N/A
  Ethernet8          1,2,3,4     400G   9100    N/A   Eth2/1  routed      up       up  OSFP 8X Pluggable Transceiver         N/A
 Ethernet12          5,6,7,8     400G   9100    N/A   Eth2/2  routed      up       up  OSFP 8X Pluggable Transceiver         N/A

admin@sonic:~$ show interfaces breakout current-mode
+-------------+-------------------------+
| Interface   | Current Breakout Mode   |
+=============+=========================+
| Ethernet0   | 8x100G                  |
+-------------+-------------------------+
| Ethernet8   | 2x400G                  |
+-------------+-------------------------+
| Ethernet16  | 8x100G                  |
+-------------+-------------------------+
| Ethernet24  | 2x400G                  |
+-------------+-------------------------+
| Ethernet32  | 2x400G                  |
+-------------+-------------------------+

  1. Check logs

Describe the results you received:

May  7 18:48:08.371367 sonic WARNING pmon#xcvrd: $$$ Ethernet0 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Eth1/1', 'index': '1', 'lanes': '9', 'speed': '100000'}
May  7 18:48:08.371500 sonic WARNING pmon#xcvrd: *** Ethernet0CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Eth1/1', 'index': '1', 'lanes': '9', 'speed': '100000', 'key': 'Ethernet0', 'asic_id': 0, 'op': 'SET'}
May  7 18:48:08.371322 sonic INFO ConfigMgmt: Writing in Config DB
May  7 18:48:08.371505 sonic INFO ConfigMgmt: Write in DB: {'PORT': {'Ethernet0': {'alias': 'Eth1/1', 'lanes': '9', 'speed': '100000', 'index': '1'}, 'Ethernet1': {'alias': 'Eth1/2', 'lanes': '10', 'speed': '100000', 'index': '1'}, 'Ethernet2': {'alias': 'Eth1/3', 'lanes': '11', 'speed': '100000', 'index': '1'}, 'Ethernet3': {'alias': 'Eth1/4', 'lanes': '12', 'speed': '100000', 'index': '1'}, 'Ethernet4': {'alias': 'Eth1/5', 'lanes': '13', 'speed': '100000', 'index': '1'}, 'Ethernet5': {'alias': 'Eth1/6', 'lanes': '14', 'speed': '100000', 'index': '1'}, 'Ethernet6': {'alias': 'Eth1/7', 'lanes': '15', 'speed': '100000', 'index': '1'}, 'Ethernet7': {'alias': 'Eth1/8', 'lanes': '16', 'speed': '100000', 'index': '1'}}}
May  7 18:48:08.395578 sonic WARNING pmon#xcvrd: $$$ Ethernet1 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Eth1/2', 'index': '1', 'lanes': '10', 'speed': '100000'}
May  7 18:48:08.395578 sonic WARNING pmon#xcvrd: $$$ Ethernet2 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Eth1/3', 'index': '1', 'lanes': '11', 'speed': '100000'}
May  7 18:48:08.395578 sonic WARNING pmon#xcvrd: $$$ Ethernet3 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Eth1/4', 'index': '1', 'lanes': '12', 'speed': '100000'}
May  7 18:48:08.395578 sonic WARNING pmon#xcvrd: $$$ Ethernet4 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Eth1/5', 'index': '1', 'lanes': '13', 'speed': '100000'}
May  7 18:48:08.395578 sonic WARNING pmon#xcvrd: $$$ Ethernet5 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Eth1/6', 'index': '1', 'lanes': '14', 'speed': '100000'}
May  7 18:48:08.395578 sonic WARNING pmon#xcvrd: $$$ Ethernet6 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Eth1/7', 'index': '1', 'lanes': '15', 'speed': '100000'}
May  7 18:48:08.395630 sonic WARNING pmon#xcvrd: $$$ Ethernet7 handle_port_update_event() : op=SET DB:CONFIG_DB Table:PORT fvp {'alias': 'Eth1/8', 'index': '1', 'lanes': '16', 'speed': '100000'}
May  7 18:48:08.395892 sonic WARNING pmon#xcvrd: *** Ethernet1CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Eth1/2', 'index': '1', 'lanes': '10', 'speed': '100000', 'key': 'Ethernet1', 'asic_id': 0, 'op': 'SET'}
May  7 18:48:08.395892 sonic WARNING pmon#xcvrd: *** Ethernet2CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Eth1/3', 'index': '1', 'lanes': '11', 'speed': '100000', 'key': 'Ethernet2', 'asic_id': 0, 'op': 'SET'}
May  7 18:48:08.395892 sonic WARNING pmon#xcvrd: *** Ethernet3CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Eth1/4', 'index': '1', 'lanes': '12', 'speed': '100000', 'key': 'Ethernet3', 'asic_id': 0, 'op': 'SET'}
May  7 18:48:08.395892 sonic WARNING pmon#xcvrd: *** Ethernet4CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Eth1/5', 'index': '1', 'lanes': '13', 'speed': '100000', 'key': 'Ethernet4', 'asic_id': 0, 'op': 'SET'}
May  7 18:48:08.395892 sonic WARNING pmon#xcvrd: *** Ethernet5CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Eth1/6', 'index': '1', 'lanes': '14', 'speed': '100000', 'key': 'Ethernet5', 'asic_id': 0, 'op': 'SET'}
May  7 18:48:08.395892 sonic WARNING pmon#xcvrd: *** Ethernet6CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Eth1/7', 'index': '1', 'lanes': '15', 'speed': '100000', 'key': 'Ethernet6', 'asic_id': 0, 'op': 'SET'}
May  7 18:48:08.395892 sonic WARNING pmon#xcvrd: *** Ethernet7CONFIG_DBPORT handle_port_update_event() fvp {'alias': 'Eth1/8', 'index': '1', 'lanes': '16', 'speed': '100000', 'key': 'Ethernet7', 'asic_id': 0, 'op': 'SET'}
May  7 18:48:08.403956 sonic ERR pmon#xcvrd: Exception occured at CmisManagerTask thread due to KeyError(None)
May  7 18:48:08.404851 sonic ERR pmon#xcvrd: Traceback (most recent call last):
May  7 18:48:08.404851 sonic ERR pmon#xcvrd:   File "/usr/local/lib/python3.9/dist-packages/xcvrd/xcvrd.py", line 1718, in run
May  7 18:48:08.404851 sonic ERR pmon#xcvrd:     self.task_worker()
May  7 18:48:08.404851 sonic ERR pmon#xcvrd:   File "/usr/local/lib/python3.9/dist-packages/xcvrd/xcvrd.py", line 1428, in task_worker
May  7 18:48:08.404888 sonic ERR pmon#xcvrd:     self.port_dict[lport]['host_tx_ready'] = self.get_host_tx_status(lport)
May  7 18:48:08.404888 sonic ERR pmon#xcvrd:   File "/usr/local/lib/python3.9/dist-packages/xcvrd/xcvrd.py", line 1300, in get_host_tx_status
May  7 18:48:08.405015 sonic ERR pmon#xcvrd:     state_port_tbl = self.xcvr_table_helper.get_state_port_tbl(asic_index)
May  7 18:48:08.405015 sonic ERR pmon#xcvrd:   File "/usr/local/lib/python3.9/dist-packages/xcvrd/xcvrd.py", line 2632, in get_state_port_tbl
May  7 18:48:08.405015 sonic ERR pmon#xcvrd:     return self.state_port_tbl[asic_id]
May  7 18:48:08.405015 sonic ERR pmon#xcvrd: KeyError: None
May  7 18:48:08.405015 sonic ERR pmon#xcvrd: Xcvrd: exception found at child thread CmisManagerTask due to KeyError(None)
May  7 18:48:08.405033 sonic ERR pmon#xcvrd: Exiting main loop as child thread raised exception!
May  7 18:48:08.414110 sonic INFO pmon#supervisord 2024-05-07 18:48:08,413 INFO exited: xcvrd (terminated by SIGKILL; not expected)
May  7 18:48:09.540293 sonic NOTICE pmon#xcvrd[4227]: Starting up...
May  7 18:48:09.540293 sonic NOTICE pmon#xcvrd[4227]: XCVRD INIT: Start daemon init...
May  7 18:48:09.670233 sonic NOTICE pmon#xcvrd[4227]: XCVRD INIT: Wait for port config is done
May  7 18:48:09.673361 sonic NOTICE pmon#xcvrd[4227]: XCVRD INIT: After port config is done
May  7 18:48:09.679986 sonic NOTICE pmon#xcvrd[4227]: Start daemon main loop with thread count 3
May  7 18:48:09.680067 sonic NOTICE pmon#xcvrd[4227]: Started thread CmisManagerTask
May  7 18:48:09.680067 sonic NOTICE pmon#xcvrd[4227]: Started thread DomInfoUpdateTask
May  7 18:48:09.680077 sonic NOTICE pmon#xcvrd[4227]: Started thread SfpStateUpdateTask

Describe the results you expected:

Output of show version:

admin@sonic:~$ show version

SONiC Software Version: SONiC.DS4101_V2.0.0_1
SONiC OS Version: 11
Distribution: Debian 11.7
Kernel: 5.10.0-23-2-amd64
Build commit: 67ede2e82
Build date: Wed Apr 10 07:05:46 UTC 2024
Built by: autobuild@AZUHPSP10

Platform: x86_64-cel_ds4101-r0
HwSKU: cel_ds4101
ASIC: broadcom
ASIC Count: 1
Serial Number: R4030F2B063A25GD200026
Model Number: R4030-F9001-02
Hardware Revision: 6
Uptime: 19:26:13 up  1:38,  2 users,  load average: 1.45, 1.52, 1.45
Date: Tue 07 May 2024 19:26:13

Docker images:
REPOSITORY                    TAG               IMAGE ID       SIZE
docker-dhcp-relay             latest            43f5d92d0d5f   309MB
docker-macsec                 latest            063cb43bd118   328MB
docker-orchagent              DS4101_V2.0.0_1   88f1d11531ea   338MB
docker-orchagent              latest            88f1d11531ea   338MB
docker-fpm-frr                DS4101_V2.0.0_1   25fef5a4a745   357MB
docker-fpm-frr                latest            25fef5a4a745   357MB
docker-sflow                  DS4101_V2.0.0_1   7f2b88118acd   327MB
docker-sflow                  latest            7f2b88118acd   327MB
docker-nat                    DS4101_V2.0.0_1   da64867dabff   329MB
docker-nat                    latest            da64867dabff   329MB
docker-eventd                 DS4101_V2.0.0_1   1a953e81a85b   300MB
docker-eventd                 latest            1a953e81a85b   300MB
docker-snmp                   DS4101_V2.0.0_1   9d41791169a6   340MB
docker-snmp                   latest            9d41791169a6   340MB
docker-platform-monitor       DS4101_V2.0.0_1   b1fc016974ff   421MB
docker-platform-monitor       latest            b1fc016974ff   421MB
docker-teamd                  DS4101_V2.0.0_1   d97ecda1ae11   326MB
docker-teamd                  latest            d97ecda1ae11   326MB
docker-lldp                   DS4101_V2.0.0_1   0e05f15eea2b   343MB
docker-lldp                   latest            0e05f15eea2b   343MB
docker-mux                    DS4101_V2.0.0_1   252b2ba2def7   349MB
docker-mux                    latest            252b2ba2def7   349MB
docker-sonic-gnmi             DS4101_V2.0.0_1   2a40110cc077   388MB
docker-sonic-gnmi             latest            2a40110cc077   388MB
docker-database               DS4101_V2.0.0_1   79d18c8282dd   301MB
docker-database               latest            79d18c8282dd   301MB
docker-gbsyncd-broncos        DS4101_V2.0.0_1   5cea29cd8028   351MB
docker-gbsyncd-broncos        latest            5cea29cd8028   351MB
docker-syncd-brcm             DS4101_V2.0.0_1   7e5381b11d8f   675MB
docker-syncd-brcm             latest            7e5381b11d8f   675MB
docker-gbsyncd-credo          DS4101_V2.0.0_1   c27a150fd640   323MB
docker-gbsyncd-credo          latest            c27a150fd640   323MB
docker-router-advertiser      DS4101_V2.0.0_1   28bbd623ce8a   301MB
docker-router-advertiser      latest            28bbd623ce8a   301MB
docker-sonic-mgmt-framework   DS4101_V2.0.0_1   303355e5ea69   417MB
docker-sonic-mgmt-framework   latest            303355e5ea69   417MB

admin@sonic:~$

Output of show techsupport:

(paste your output here or download and attach the file here )

Additional information you deem important (e.g. issue happens only occasionally):

@neethajohn
Copy link
Contributor

@prgeor , can you please take a look?

@neethajohn neethajohn added Triaged this issue has been triaged Help Wanted 🆘 labels May 8, 2024
ishidawataru added a commit to ishidawataru/sonic-platform-daemons that referenced this issue Jun 14, 2024
This commit fixes DPB support with CMIS transceivers.

CmisManagerTask's port_dict and port_mapping must be updated
according to the port add/remove events.
Sinse CmisManagerTask uses PortChangeObserver, which emits
PORT_SET/PORT_DEL events, not PORT_ADD/PORT_REMOVE,
PortMapping is updated to handle these events.

fixes sonic-net/sonic-buildimage#18893

Signed-off-by: Wataru Ishida <[email protected]>
@ishidawataru
Copy link
Collaborator

Hi, I reproduced this issue on the VS environment.

The issue was that xcvrd's CmitManagerTask was not taking care of port addition and removal events, that occur from DPB configuration change.

sonic-net/sonic-platform-daemons#500 fixes the issue on my environment.

Since I don't have access to the real hardware that supports DPB and CMIS transceivers, I enhanced the VS environment to support DPB and run xcvrd for reproduction and testing.

@HaiHuang008 It would be great if you could check whether this fix resolves the issue on your platform.

@HaiHuang008
Copy link
Author

@ishidawataru Thanks for your fix, this error no longer appears. But another error appears, causing xcvrd to crash.

image

ishidawataru added a commit to ishidawataru/sonic-platform-daemons that referenced this issue Jun 19, 2024
This commit fixes DPB support with CMIS transceivers.

CmisManagerTask's `port_dict` must be updated according to the port add/remove events.
This commit removes the `port_mapping` field from CmisManagerTask as
`port_mapping` was mostly used just for storing `asic_id` information
and that can be simply done by `port_dict` instead.

Added a helper method `get_asic_id()` method to CmisManagerTask for
getting `asic_id` from `logical_port`.

fixes sonic-net/sonic-buildimage#18893

Signed-off-by: Wataru Ishida <[email protected]>
@ishidawataru
Copy link
Collaborator

@HaiHuang008 Thanks for testing and I'm sorry that the original fix didn't work. I think I could fix the problem of the original fix.
I updated the PR
Could you try again?

@HaiHuang008
Copy link
Author

@ishidawataru Thanks again. This issue is resolved. No xcvrd crash when port breakout.
Check_port_breakout.txt

ishidawataru added a commit to ishidawataru/sonic-platform-daemons that referenced this issue Jun 21, 2024
This commit fixes DPB support with CMIS transceivers.

CmisManagerTask's `port_dict` must be updated according to the port add/remove events.
This commit removes the `port_mapping` field from CmisManagerTask as
`port_mapping` was mostly used just for storing `asic_id` information
and that can be simply done by `port_dict` instead.

Added a helper method `get_asic_id()` method to CmisManagerTask for
getting `asic_id` from `logical_port`.

fixes sonic-net/sonic-buildimage#18893

Signed-off-by: Wataru Ishida <[email protected]>
@ishidawataru
Copy link
Collaborator

@HaiHuang008 Thanks for testing. I'll try to get the PR merged.

ishidawataru added a commit to ishidawataru/sonic-platform-daemons that referenced this issue Jun 24, 2024
This commit fixes DPB support with CMIS transceivers.

CmisManagerTask's `port_dict` must be updated according to the port add/remove events.
This commit removes the `port_mapping` field from CmisManagerTask as
`port_mapping` was mostly used just for storing `asic_id` information
and that can be simply done by `port_dict` instead.

Added a helper method `get_asic_id()` method to CmisManagerTask for
getting `asic_id` from `logical_port`.

fixes sonic-net/sonic-buildimage#18893

Signed-off-by: Wataru Ishida <[email protected]>
ishidawataru added a commit to ishidawataru/sonic-platform-daemons that referenced this issue Jun 24, 2024
This commit fixes DPB support with CMIS transceivers.

CmisManagerTask's `port_dict` must be updated according to the port add/remove events.
This commit removes the `port_mapping` field from CmisManagerTask as
`port_mapping` was mostly used just for storing `asic_id` information
and that can be simply done by `port_dict` instead.

Added a helper method `get_asic_id()` method to CmisManagerTask for
getting `asic_id` from `logical_port`.

fixes sonic-net/sonic-buildimage#18893

Signed-off-by: Wataru Ishida <[email protected]>
ishidawataru added a commit to ishidawataru/sonic-platform-daemons that referenced this issue Jun 25, 2024
This commit fixes DPB support with CMIS transceivers.

CmisManagerTask's `port_dict` must be updated according to the port add/remove events.
This commit removes the `port_mapping` field from CmisManagerTask as
`port_mapping` was mostly used just for storing `asic_id` information
and that can be simply done by `port_dict` instead.

Added a helper method `get_asic_id()` method to CmisManagerTask for
getting `asic_id` from `logical_port`.

fixes sonic-net/sonic-buildimage#18893

Signed-off-by: Wataru Ishida <[email protected]>
ishidawataru added a commit to ishidawataru/sonic-platform-daemons that referenced this issue Jul 23, 2024
This commit fixes DPB support with CMIS transceivers.

CmisManagerTask's `port_dict` must be updated according to the port add/remove events.
This commit removes the `port_mapping` field from CmisManagerTask as
`port_mapping` was mostly used just for storing `asic_id` information
and that can be simply done by `port_dict` instead.

Added a helper method `get_asic_id()` method to CmisManagerTask for
getting `asic_id` from `logical_port`.

fixes sonic-net/sonic-buildimage#18893

Signed-off-by: Wataru Ishida <[email protected]>
ishidawataru added a commit to ishidawataru/sonic-platform-daemons that referenced this issue Jul 23, 2024
This commit fixes DPB support with CMIS transceivers.

CmisManagerTask's `port_dict` must be updated according to the port add/remove events.
This commit removes the `port_mapping` field from CmisManagerTask as
`port_mapping` was mostly used just for storing `asic_id` information
and that can be simply done by `port_dict` instead.

Added a helper method `get_asic_id()` method to CmisManagerTask for
getting `asic_id` from `logical_port`.

fixes sonic-net/sonic-buildimage#18893

Signed-off-by: Wataru Ishida <[email protected]>
ishidawataru added a commit to ishidawataru/sonic-platform-daemons that referenced this issue Sep 17, 2024
CmisManagerTask's `port_dict` and `port_mapping` must be updated
according to the port add/remove events.

Before this commit, `port_mapping` is only intialized when
CmisManagerTask is initialized and not updated after that,
which was causing KeyError exception when DBP is used.
(sonic-net/sonic-buildimage#18893)

This commit removes the `port_mapping` field from CmisManagerTask as
`port_mapping` was used just for storing `asic_id` information
and that can be simply done by `port_dict` instead.

Also, this commit updates `port_dict` accoding to the port add/remove
events to support DPB.

Signed-off-by: Wataru Ishida <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Help Wanted 🆘 Triaged this issue has been triaged
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants