An issue in the NFNIC driver/firmware might lead to an APD (All Paths Down) condition in ESXi during a short window of the InfiniBox hot upgrade process. As a result, data may be temporarily inaccessible during the InfiniBox system software upgrade.
This issue is relevant for environments with UCS hardware running VMware ESXi 6.7 with a Cisco Native FNIC (NFNIC) version earlier than 188.8.131.52.
To determine which version of NFNIC is running:
- Connect to the ESXi host over SSH: ssh root@<esxi FQDN or IP addr>
- Display the loaded NFNIC module information: vmkload_mod -s nfnic
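The version check can also be scripted. The following is a minimal sketch, not a vendor-supplied tool: the helper names `version_lt` and `nfnic_version` and the `MIN_VERSION` variable are hypothetical, and the sketch assumes that the output of `vmkload_mod -s nfnic` contains a line of the form `Version: <dotted version>-<suffix>`. Set `MIN_VERSION` to the fixed NFNIC version given in this article before running it.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if dotted version $1 is lower than $2.
# Only the leading dotted numeric part is compared; anything after the first
# "-" (for example a build suffix) is ignored.
version_lt() {
    a=${1%%-*}
    b=${2%%-*}
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}
        y=${b%%.*}
        [ "${x:-0}" -lt "${y:-0}" ] && return 0
        [ "${x:-0}" -gt "${y:-0}" ] && return 1
        # Drop the component that was just compared.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 1   # versions are equal
}

# Hypothetical helper: reads module information on stdin and prints the value
# of the first "Version:" line (assumed output format, see above).
nfnic_version() {
    awk '/Version:/ { print $2; exit }'
}

# Run the check only where vmkload_mod exists (that is, on an ESXi host).
if command -v vmkload_mod >/dev/null 2>&1; then
    : "${MIN_VERSION:?set MIN_VERSION to the fixed NFNIC version}"
    current=$(vmkload_mod -s nfnic | nfnic_version)
    if version_lt "$current" "$MIN_VERSION"; then
        echo "NFNIC $current is earlier than $MIN_VERSION: host is affected"
    else
        echo "NFNIC $current is at or above $MIN_VERSION"
    fi
fi
```

The comparison is done component by component in plain POSIX shell so that it does not depend on tools that may be missing from the ESXi shell.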
During the hot upgrade of an InfiniBox system, there is a short window (approximately 8-12 seconds) during which the microcode queues commands from initiators before it resumes sending responses.
If the following events occur during this window, an APD event is possible:
- The NFNIC driver sees a series of SCSI aborts
- A single REPORT_LUNS command is sent and does not receive a response in time
When this happens, the NFNIC driver never retries the REPORT_LUNS SCSI command, causing ESXi to report APD.
To restore the ESXi host's access to the data, reboot the ESXi server.
Upgrade Cisco NFNIC to version 184.108.40.206 or later, as listed in the Reference section.