
PFC Watchdog

Marian Pritsak edited this page Sep 11, 2017 · 6 revisions

Overview

PFC watchdog is designed to detect and mitigate a PFC storm received on each port. PFC pause frames are used in lossless Ethernet to pause the link partner from sending packets. Such a back-pressure mechanism can propagate across the whole network and cause the network to stop forwarding traffic. PFC watchdog detects abnormal back-pressure caused by receiving excessive PFC pause frames and mitigates the situation by temporarily disabling the PFC-induced pause. PFC watchdog has three functional blocks: detection, mitigation and restoration.

Functional Specification

PFC storm detection

PFC storm detection lets a switch detect that a lossless queue is receiving a PFC storm from its link partner and that the queue has been in a paused state for more than T0. Even when the queue is empty, the watchdog should detect the storm as soon as the queue has been paused for longer than T0. T0 is a port-level parameter. Detection must be enabled and disabled at the per-port level. The detection mechanism is only available for lossless queues and is disabled by default. T0 should be on the order of hundreds of milliseconds.
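The detection rule above can be sketched as a small polling state machine. This is a minimal illustration, not the actual implementation: the class name `QueueDetector`, its fields, and the idea that a poller supplies `now_ms` and `is_paused` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueueDetector:
    """Hypothetical per-queue detector; names are illustrative only."""
    detection_time_ms: int                 # T0, a port-level parameter
    enabled: bool = False                  # detection is disabled by default
    paused_since_ms: Optional[int] = None  # start of the current paused interval

    def poll(self, now_ms: int, is_paused: bool) -> bool:
        """Return True once the queue has been paused for longer than T0."""
        if not self.enabled:
            return False
        if not is_paused:
            # Any unpaused sample ends the current paused interval.
            self.paused_since_ms = None
            return False
        if self.paused_since_ms is None:
            self.paused_since_ms = now_ms
        # Queue occupancy is deliberately not consulted: an empty queue
        # that stays paused longer than T0 still counts as a storm.
        return now_ms - self.paused_since_ms > self.detection_time_ms
```

For example, with `QueueDetector(detection_time_ms=200, enabled=True)`, polls at 0 ms and 200 ms with the queue paused return False, and a poll at 201 ms returns True.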

PFC storm mitigation

Once a PFC storm is detected on a queue, the watchdog can take one of two actions, drop or forward, at the per-queue level. When the drop action is selected, the following actions need to be implemented.

  • All existing packets in the output queue are discarded.
  • All subsequent packets destined to the output queue are discarded.
  • All subsequent packets received by the corresponding priority group of this queue are discarded, including any pause frames received. As a result, the switch should not generate any pause frame to its neighbor due to congestion of this output queue.

When the forward action is selected, the following action needs to be implemented.

  • The queue no longer honors the PFC frames received. All packets destined to the queue are forwarded, as are the packets that were already in the queue.

The default action is drop.
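The two mitigation actions can be illustrated with a toy software model of a queue. This is only a sketch under assumptions: `Action`, `LosslessQueue`, and every field name are hypothetical, and on a real switch these steps would be programmed into hardware rather than executed per packet in software.

```python
from collections import deque
from enum import Enum


class Action(Enum):
    DROP = "drop"
    FORWARD = "forward"


class LosslessQueue:
    """Toy model of one output queue and its ingress priority group."""

    def __init__(self):
        self.packets = deque()
        self.pfc_honored = True    # queue pauses when PFC frames arrive
        self.dropping = False      # egress side discards new packets
        self.pg_dropping = False   # ingress priority group discards everything
        self.tx_dropped = 0

    def mitigate(self, action=Action.DROP):
        """Apply the selected action; drop is the default."""
        if action is Action.DROP:
            # Discard packets already in the output queue.
            self.tx_dropped += len(self.packets)
            self.packets.clear()
            # Discard packets subsequently destined to the queue.
            self.dropping = True
            # Discard packets (including pause frames) on the matching
            # ingress priority group.
            self.pg_dropping = True
        else:
            # Forward: stop honoring received PFC frames, keep queued packets.
            self.pfc_honored = False

    def enqueue(self, pkt):
        """Return True if the packet was queued, False if it was dropped."""
        if self.dropping:
            self.tx_dropped += 1
            return False
        self.packets.append(pkt)
        return True
```

Calling `mitigate()` with no argument applies drop, which mirrors the default stated above.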

PFC storm restoration

The watchdog should continue to count the PFC frames received on the queue. If no PFC frame is received over a period of T1, re-enable PFC on the queue and, if the previous mitigation was drop, stop dropping packets. T1 is a port-level parameter. T1 should be on the order of hundreds of milliseconds.
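The restoration check can be sketched the same way as detection: the watchdog samples the received-PFC-frame counter and restores the queue once the counter has not moved for T1. The names `RestoreTimer`, `restoration_time_ms`, and `pfc_rx_count` are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class RestoreTimer:
    """Hypothetical per-queue restoration timer, started when mitigation begins."""
    restoration_time_ms: int   # T1, a port-level parameter
    last_pfc_count: int        # PFC frame counter at the last change
    quiet_since_ms: int        # time of the last observed counter change

    def poll(self, now_ms: int, pfc_rx_count: int) -> bool:
        """Return True when no PFC frame has been received for T1."""
        if pfc_rx_count != self.last_pfc_count:
            # New PFC frames arrived: restart the quiet period.
            self.last_pfc_count = pfc_rx_count
            self.quiet_since_ms = now_ms
            return False
        return now_ms - self.quiet_since_ms >= self.restoration_time_ms
```

When `poll` returns True, the caller would re-enable PFC on the queue and clear any drop state installed by the mitigation step.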

Logging Requirement

Print a Notice-level log message when a PFC storm is detected and when it is restored. When PFC is restored, print the number of dropped/forwarded packets for both the output queue and the ingress priority group.

PFC WD counters

Keep database entries for the following counter values:

  • Queue deadlock counter
  • Queue restore counter
  • Number of Tx packets dropped due to PFC deadlock
  • Number of Rx packets dropped due to PFC deadlcok
  • Number of Tx packets transmitted during deadlock (Forward action)
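One possible shape for these per-queue counters is sketched below. The field names, the `PFC_WD_COUNTERS|port|queue` key layout, and the string-valued fields are all assumptions for illustration; the real database schema is left to the design spec.

```python
from dataclasses import dataclass, asdict


@dataclass
class PfcWdQueueCounters:
    """Hypothetical per-queue counter record; field names are illustrative."""
    deadlock_detected: int = 0   # queue deadlock counter
    deadlock_restored: int = 0   # queue restore counter
    tx_dropped: int = 0          # Tx packets dropped due to PFC deadlock
    rx_dropped: int = 0          # Rx packets dropped due to PFC deadlock
    tx_forwarded: int = 0        # Tx packets sent during deadlock (forward action)


def to_db_entry(port: str, queue: int, counters: PfcWdQueueCounters):
    """Build a (key, fields) pair for a key-value store; the key format is assumed."""
    key = f"PFC_WD_COUNTERS|{port}|{queue}"
    fields = {name: str(value) for name, value in asdict(counters).items()}
    return key, fields
```

The same record could also feed the Notice-level restoration log, since it already carries the dropped and forwarded packet counts.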

Testing Requirement

To be filled in the design spec.
