nRF70: TX packets silently dropped #88857
Labels: area: Wi-Fi, bug, platform: nRF, priority: low
Describe the bug
The TX path of the nRF70 Wi-Fi driver silently* drops TX packets if too many packets are queued at once:
https://github.com/zephyrproject-rtos/nrf_wifi/blob/e2c9a783448d919d191591be04ae9f3dd0643027/fw_if/umac_if/src/system/tx.c#L1102
* An error code is returned, but in practice it is ignored all the way back up the stack.
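To make the failure mode concrete, here is a minimal host-side model of the behaviour described above. It is not the driver's code: `MAX_TX_PENDING_QLEN`, `struct tx_model`, `model_tx_enqueue`, and `model_send_burst` are hypothetical names invented for this sketch. It only assumes a fixed-length pending queue whose "full" error is discarded by the caller.

```c
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_NRF70_MAX_TX_PENDING_QLEN. */
#define MAX_TX_PENDING_QLEN 4

struct tx_model {
    size_t pending; /* packets accepted and waiting for the bus */
    size_t dropped; /* packets rejected because the queue was full */
};

/* Returns 0 on success, -1 when the pending queue is already full. */
static int model_tx_enqueue(struct tx_model *m)
{
    if (m->pending >= MAX_TX_PENDING_QLEN) {
        m->dropped++;
        return -1;
    }
    m->pending++;
    return 0;
}

/* Mimics upper layers that discard the status, so the drop is never seen. */
static void model_send_burst(struct tx_model *m, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        (void)model_tx_enqueue(m);
    }
}
```

In this model, a burst of 10 packets with nothing draining the queue leaves 4 pending and 6 dropped, and the caller of `model_send_burst` has no way to tell.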
To Reproduce
Run `zephyr/samples/net/zperf` on an nRF7002dk, with a reduced `CONFIG_NRF70_MAX_TX_PENDING_QLEN` or a very large "rate" parameter.
Expected behavior
Packets that have been properly created, allocated, and pushed all the way down to the lowest layers of the Wi-Fi driver shouldn't be arbitrarily dropped just because a large number of packets have been previously queued.
Impact
Dropped packets, broken protocols, etc.
This also leads to super un-intuitive behavior such as increasing the heap size leading to increased packet loss (since more packets can be pending with the extra memory).
Additional context
I'm not sure what the goal of `NRF70_MAX_TX_PENDING_QLEN` is supposed to be, since the memory is already allocated (mostly) at this point, and dropping packets is infinitely worse than the packet simply being sent later than you might otherwise expect.
This problem gets worse at lower SPI/QSPI bus speeds, since the ability of the driver to clear out packets is reduced.
The application is also totally unaware that this is happening, and has no way to implement any backoff mechanism.
The two current workarounds:
- `k_sleep` between packet sends, in order to limit the packets fed to the driver
- Increase `CONFIG_NRF70_MAX_TX_PENDING_QLEN` to the point the option stops being relevant
The second option can also lead to deadlocks, since `nrf_wifi_utils_q_enqueue` is trying to allocate memory with the `vif_lock` held (see #88781).
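For reference, the first workaround could look roughly like the sketch below. The helper names `pacing_interval_us` and `paced_send` are hypothetical, and the socket is assumed to be already connected. The Zephyr-specific part is guarded so that the interval calculation can be checked on a host. The sleep granularity is bounded by the kernel tick rate, so very short gaps may end up longer than computed.

```c
#include <stddef.h>
#include <stdint.h>

/* Microseconds to wait between sends so that packets of payload_bytes
 * average at most rate_bps bits per second (rounded up).
 * Returns 0 when rate_bps is 0, meaning "no pacing". */
static uint32_t pacing_interval_us(uint32_t payload_bytes, uint32_t rate_bps)
{
    if (rate_bps == 0U) {
        return 0U;
    }
    /* 64-bit intermediate so bits * 1e6 cannot overflow. */
    uint64_t bits = (uint64_t)payload_bytes * 8U;
    return (uint32_t)((bits * 1000000U + rate_bps - 1U) / rate_bps);
}

#ifdef __ZEPHYR__
#include <zephyr/kernel.h>
#include <zephyr/net/socket.h>

/* Sends `count` copies of buf, sleeping between sends to limit how many
 * packets reach the driver at once. Stops early if the socket layer
 * reports an error; per this issue, driver-level drops are NOT reported. */
static int paced_send(int sock, const void *buf, size_t len,
                      uint32_t count, uint32_t rate_bps)
{
    uint32_t gap_us = pacing_interval_us((uint32_t)len, rate_bps);

    for (uint32_t i = 0; i < count; i++) {
        if (zsock_send(sock, buf, len, 0) < 0) {
            return -1;
        }
        k_sleep(K_USEC(gap_us));
    }
    return 0;
}
#endif /* __ZEPHYR__ */
```

This only reduces the chance of overrunning the pending queue. Since the drop is not reported to the application, the right rate still has to be found by trial and error for a given bus speed.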