Potential Bug with MQTT Client.lvlib:Client.lvclass:Ping.vi #14
Thanks @agluck28 , I'll investigate and make a unit test for this. It seems we need to ensure we wait for the response, although the exact criteria might need some tweaking. By default, we need to receive a response, as per the specification:
I think this also happens if the client is receiving QoS 0 messages at the same time pings are being sent. Since those messages carry no message ID, it defaults to 0, and the client assumes the incoming message is the response to the PINGREQ, which also registered a message ID of 0 for its expected response. Something new will be needed to look for the PINGREQ's response, since the message ID can't be used to tie things together while other messages are most likely being received as well. I think an override will be needed to handle the PINGRESP from the broker here: expose the PINGRESP from the server by broadcasting it on a new user event within MQTT Client.lvlib:Client.lvclass:Handle Incoming Packets.vi, like below (the code isn't that clean, it's more of an example). This event can then be registered for when sending PINGREQ within MQTT Client.lvlib:Session.Client.lvclass:Send Packet.vi to wait for the expected response from the server. This also lets users of the library register for the PINGRESP event and take action if needed.
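The LabVIEW snippet referenced above isn't included here; in Python terms, the proposed broadcast could look roughly like the sketch below. All class and method names are illustrative, not from the library: the packet handler fans a PINGRESP out to any registered listeners instead of trying to match it by message ID.

```python
from typing import Callable, List

class PacketDispatcher:
    """Illustrative sketch of the proposal: broadcast PINGRESP on a
    dedicated event instead of matching it by message ID."""

    PINGRESP_TYPE = 13  # MQTT control packet type for PINGRESP

    def __init__(self) -> None:
        self._pingresp_listeners: List[Callable[[], None]] = []

    def register_pingresp(self, callback: Callable[[], None]) -> None:
        # Analogous to registering for the new user event before sending PINGREQ
        self._pingresp_listeners.append(callback)

    def handle_incoming(self, first_byte: int) -> None:
        # The high nibble of the fixed header's first byte is the packet type
        if (first_byte >> 4) == self.PINGRESP_TYPE:
            for callback in self._pingresp_listeners:
                callback()
```

A ping sender would register a callback before transmitting PINGREQ, then time out if the callback never fires.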
I am worried about the implications of this proposed modification. I have not tried it yet, but could it be that simply not reserving the ID when it is 0x0000 would fix this? By the way, PINGRESP is a type of "ACK" packet, so not handling it as such might cause more problems than it solves.
The broker can be receiving packets from other publishers that the pinging client is subscribed to, and if those have a QoS of 0 there is no message ID. I don't think TCP is delivering packets out of order; it's just that packets other than the PINGRESP are being sent from the broker to the client. I believe the only reason a 0x0000 packet is being generated is a bug in waiting for a response to a PINGREQ: the Session receives an ordinary QoS 0 message that another client is publishing to the broker, sees that it has a message ID of 0, and matches it to the PINGREQ, which registered a message ID of 0 for its expected response. It then responds back to the broker, I believe due to default values, with a control packet of 0x0000, which is invalid. From my understanding, the broker is only expected to respond with a PINGRESP when it receives a PINGREQ; no other response is expected. Looking at the Python MQTT client implementation, its client loop handles PINGREQ and PINGRESP based on the keep-alive time set at client creation. That could be an approach here: handle this at a lower level and just bubble up an event when the PINGRESP isn't received in a timely manner after the PINGREQ.
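The timing-based approach described above can be sketched as follows. This is a minimal illustration, not the Python client's actual API: a ping is considered overdue when a PINGREQ has been outstanding longer than the keep-alive window with no PINGRESP, so no packet ID is involved at all.

```python
class PingTracker:
    """Track PINGREQ/PINGRESP by timing instead of by message ID.
    Hypothetical sketch; names are not from any library."""

    def __init__(self, keep_alive_s: float) -> None:
        self.keep_alive_s = keep_alive_s
        self.last_req: float | None = None
        self.last_resp: float | None = None

    def on_pingreq_sent(self, now: float) -> None:
        self.last_req = now

    def on_pingresp(self, now: float) -> None:
        self.last_resp = now

    def is_overdue(self, now: float) -> bool:
        # No ping outstanding -> nothing can be overdue
        if self.last_req is None:
            return False
        # A response arrived after the last request -> satisfied
        if self.last_resp is not None and self.last_resp >= self.last_req:
            return False
        return (now - self.last_req) > self.keep_alive_s
```

When `is_overdue` returns True, the client would raise the "no timely PINGRESP" event mentioned above.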
Hi @agluck28 , I'm trying to design a valid test case to reproduce the problem. Do you think the following sequence describes the issue you are seeing?
Since a typical PingResp arrives with >100 ms latency against the mosquitto test server, it feels like this should be fairly easy to reproduce by more than pure luck, correct? EDIT: Yes, that works... I reproduced the problem on the first Ping packet sent.
That is very close to what is happening, though I think the disconnect happens because the client responds with an invalid packet when it receives data after a PINGREQ. This was very reproducible when I was testing; I had an immediate disconnect after every ping sent. I do like working issues that always reproduce versus ones that are subject to race conditions.
Just for terminology: I think the problem lies in the Session, which stores the packetID and "consumes" the incoming acknowledgements. Of all the packets, only CONNECT, CONNACK, DISCONNECT, PINGREQ and PINGRESP have no packet IDs, along with PUBLISH at QoS = 0. Because PINGREQ and PINGRESP are not the only packets without IDs "while a session is ongoing", you have the correct instinct that they will need to be treated differently to decouple them from the PUBLISH packets that lack IDs. My earlier tests had all sent pings while no other communication was active, so I did not see the problem of incoming packets being received while waiting for the ping responses. I'll look into your suggestion to use the keep-alive time as the virtual packet ID. If I keep PUBLISH = 0 and swap in a virtual ID for PINGREQ/PINGRESP (even an arbitrary one such as 0xFFFF), it might solve the problem. If that's the case, I assume I can circumscribe the solution space to the Session.Client class.
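The virtual-ID idea can be sketched like this (the packet type numbers are from the MQTT 3.1.1 spec; the function itself and the 0xFFFF sentinel are the hypothetical workaround discussed above, not the library's behavior):

```python
PING_VIRTUAL_ID = 0xFFFF  # arbitrary sentinel, as suggested above

PINGREQ_TYPE = 12   # MQTT control packet types per the 3.1.1 spec
PINGRESP_TYPE = 13

def tracking_id(packet_type: int, packet_id: int) -> int:
    """Return the ID used to match requests with acknowledgements.

    PINGREQ/PINGRESP carry no packet ID on the wire, so they borrow a
    virtual one here; that keeps them from colliding with QoS 0 PUBLISH
    packets, which keep their default ID of 0.
    """
    if packet_type in (PINGREQ_TYPE, PINGRESP_TYPE):
        return PING_VIRTUAL_ID
    return packet_id
```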
Oh, I think I have an idea to solve this elegantly... instead of using the packetID (U16) as the key for storing and mapping acknowledgements in the Session class, I'll use a U32 that concatenates the packetID with the immutable packet header of the expected response. I'll add a filter to ignore the packetID of PUBLISH packets at QoS = 0 so we do not grow the ID map, and that should correctly map any packet that expects an ACK.
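As a sketch of that keying scheme (illustrative Python; the helper name is made up), the U32 key places the expected response's fixed-header byte in the high bits and the 16-bit packetID in the low bits, so a PINGRESP and a QoS 0 PUBLISH can never collide even though both carry ID 0:

```python
PINGRESP_HEADER = 0xD0  # fixed-header byte of PINGRESP
PUBACK_HEADER = 0x40    # fixed-header byte of PUBACK

def ack_key(expected_header: int, packet_id: int) -> int:
    """Build a U32 map key: expected response header in the high
    bits, 16-bit packet ID in the low bits."""
    return ((expected_header & 0xFF) << 16) | (packet_id & 0xFFFF)
```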
Has this issue been addressed? I am getting client disconnects, even though I am sending periodic pings to the server. My publisher is sending QoS 0 packets only. I have a test VI that disconnects randomly at different times. Sometimes it will run all night without a disconnect, other times it will disconnect in six minutes. |
When implementing ping against other MQTT brokers (tested on Mosquitto, VerneMQ and HiveMQ), once the LabVIEW client sent the Ping, the broker would disconnect the client due to receiving an invalid message.
Looking into the code, I think there is an issue with Ping.vi expecting a response. The response implementation uses the packet's message ID to tie things together, but the PINGREQ type has no message ID and always defaults to a value of 0. This appears to cause the Client, when it receives a PINGRESP packet (or really any packet), to respond back to the broker with a payload of 0x0000, which is an unknown packet type and causes the disconnect.
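For reference, these are the relevant wire encodings from the MQTT 3.1.1 spec. Packet type 0 is reserved, which is why a broker that receives the two bytes 0x00 0x00 treats it as a protocol violation and closes the connection:

```python
# MQTT 3.1.1 fixed headers: first byte = (packet type << 4) | flags,
# second byte = remaining length (0 for the ping packets)
PINGREQ  = bytes([0xC0, 0x00])  # packet type 12; carries no packet ID
PINGRESP = bytes([0xD0, 0x00])  # packet type 13; carries no packet ID
INVALID  = bytes([0x00, 0x00])  # packet type 0 is reserved: protocol error

def packet_type(packet: bytes) -> int:
    """Extract the control packet type from the fixed header."""
    return packet[0] >> 4
```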
Passing False for the Wait for ack input allows the Pings to be sent and the broker to respond as expected.