Weird retained message behaviour on reconnection of mosquitto bridge

Hello everyone,

I’ve observed some unexpected behavior on my mosquitto bridge when the bridge was disconnected and reconnected again. Simple scenario:
client1 ↔ Broker Pi ↔ Broker local ↔ client2

The clients are also simple for demonstration:
client2 publishes on a “desired” topic, which client1 subscribes to. When client1 receives a message on that topic, it simply republishes the content on a retained “reported” topic, which client2 subscribes to in order to set an internal attribute. The QoS for all subscriptions is set to 1 to ensure the messages are received.
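For readers without the attachment, the client logic described above might look roughly like the following. This is only a sketch and not the content of clients.zip: the topic names, the client id, the dict used as userdata and the `run_client1` helper are assumptions, and the callbacks use the paho-mqtt 1.x signatures.

```python
# Sketch of the two demo clients' message handling (paho-mqtt 1.x callbacks).
# Topic names and helper names are assumptions, not taken from clients.zip.

DESIRED_TOPIC = "desired"    # assumed topic name
REPORTED_TOPIC = "reported"  # assumed topic name


def client1_on_message(client, userdata, msg):
    """client1: echo every 'desired' payload onto the retained 'reported' topic."""
    if msg.topic == DESIRED_TOPIC:
        client.publish(REPORTED_TOPIC, msg.payload, qos=1, retain=True)


def client2_on_message(client, userdata, msg):
    """client2: store the latest 'reported' payload as its internal attribute."""
    if msg.topic == REPORTED_TOPIC:
        # userdata is assumed to be a dict passed to mqtt.Client(userdata=...)
        userdata["attribute"] = msg.payload.decode()


def run_client1(host, port=1883):
    """Connect client1 to its broker; needs paho-mqtt 1.6.x installed."""
    import paho.mqtt.client as mqtt

    client = mqtt.Client(client_id="client1")
    # Subscribe with QoS 1 once the connection is acknowledged.
    client.on_connect = lambda c, u, flags, rc: c.subscribe(DESIRED_TOPIC, qos=1)
    client.on_message = client1_on_message
    client.connect(host, port)
    client.loop_forever()
```

client2 would be wired up the same way, subscribing to the “reported” topic and using `client2_on_message`.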
To reproduce the behavior, start all components on the different devices (I use my local machine and a Raspberry Pi), then perform the following steps:

  1. disconnect the internet connection to disconnect the broker bridge.
  2. publish some messages from client2 on the “desired” topic
  3. reconnect to the internet and wait for the bridge to reconnect

I expect client1 to receive all messages that were published while the bridge was disconnected and to publish them on “reported” in the order client2 published them. Likewise, I expect client2 to receive all of those messages, so that the internal attribute ends up set to the latest message sent by client2 itself.
For instance:

  1. start:
    • client2 publishes “initial” → client1 publishes retained “initial” → client2 sets internal attribute to “initial”
  2. disconnect
  3. client2 publishes “foo” → internal attribute = “initial”
  4. client2 publishes “bar” → internal attribute = “initial”
  5. reconnect
  6. client1 receives “foo”, publishes “foo”
  7. client1 receives “bar”, publishes “bar”
  8. client2 receives “foo” → internal attribute = “foo”
  9. client2 receives “bar” → internal attribute = “bar”

I observe that client2 also receives the retained message “initial” before step 8, which is okay, but it receives it again after step 9. As a result, the messages newly sent by client1 are overridden, and the retained message the broker currently holds is “initial”, not “bar”.
Any ideas why this happens?

Configuration

  • I provide two simple Python scripts using paho-mqtt 1.6.1 for client1 and client2
    • you may have to replace/set the environment variables to connect to your brokers
    • clients.zip (1015 Bytes)
  • mosquitto version 2.0.18
  • mosquitto bridge configuration:
topic # both 1
remote_clientid some_id


start_type automatic
try_private true
bridge_protocol_version mqttv50
cleansession false

Hi,

So you have client 1 connected to the Pi, client 2 connected to your desktop, and either the Pi connecting to the desktop as a bridge, or the desktop connecting to the Pi as a bridge.

You publish a message from client 2, wait for the response from client 1 to get back to client 2, then break the connection between the Pi and the desktop.

While the connection is down, you publish more messages from client 2, but because the bridge is down, client 1 doesn’t receive those messages and can’t respond, so the internal attribute of client 2 remains “initial”.

When the connection is restored, bad stuff happens. This is all about the connection between the two brokers. I’m assuming the desktop is the one creating the bridge.

  1. Desktop sends CONNECT
  2. Pi replies with CONNACK
  3. Desktop sends SUBSCRIBE for incoming topics
  4. Desktop queues up retained messages to the Pi for outgoing topics (including the “initial” state)
  5. Pi sends SUBACK
  6. Pi sends queued messages (“foo”, “bar”) to Desktop
  7. Pi sends retained messages to bridge to comply with the new subscription
  8. Pi receives messages from step 4 - “initial”
  9. Desktop sends “foo”, “bar” to client 2
  10. Desktop sends “initial” to client 2

So what is happening is that the bridge client on the Pi has messages queued for it by client 2. When the connection is recreated by the desktop, it always makes the subscription request again. This includes the existing retained messages. We have to preserve order, so the retained messages are added to the bridge client queue after the queued messages, and so we end up with foo, bar, initial.
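The ordering described above can be illustrated with a toy model. This is not mosquitto code; the function names are made up for illustration, and the only assumption is the one stated in the text: retained messages triggered by the re-subscription are appended behind the messages already queued for the bridge client.

```python
from collections import deque


def deliver_after_reconnect(queued, retained_on_subscribe):
    """Return messages in the order the bridge client's queue would drain.

    queued: messages stored while the bridge was down.
    retained_on_subscribe: retained messages sent because the bridge re-subscribes.
    """
    queue = deque(queued)                # already waiting, oldest first
    queue.extend(retained_on_subscribe)  # appended behind them to preserve order
    return list(queue)


def final_retained(delivery_order, initial=None):
    """A retained topic keeps the last message stored, so the last delivery wins."""
    state = initial
    for payload in delivery_order:
        state = payload
    return state


order = deliver_after_reconnect(["foo", "bar"], ["initial"])
print(order)                  # ['foo', 'bar', 'initial']
print(final_retained(order))  # initial
```

Because the stale retained message is delivered last, it overwrites “bar” both in client 2 and in the retained store of the receiving broker.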

I’m not sure what the best solution is at the moment.

Regards,

Roger

Hi Roger,

thanks for your reply. Your assumption is right: the desktop’s broker creates the bridge to the Pi’s broker, and your reconstruction of the scenario is exactly what I tried to describe earlier.

From my point of view, this shouldn’t happen and is a bug. The only solution I can think of is a workaround that checks the messages after a broker bridge reconnect, but that is rather ugly and complicated.
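One possible shape for such a check, purely as a sketch: client2 adds a monotonically increasing sequence number to every “desired” payload, client1 echoes the payload unchanged, and client2 ignores any “reported” message that is not newer than the last one it accepted. The JSON payload layout and the names `encode` and `ReportedFilter` are assumptions for illustration, not part of the attached scripts.

```python
import json


def encode(seq, value):
    """Build a payload carrying a sequence number (assumed JSON layout)."""
    return json.dumps({"seq": seq, "value": value}).encode()


class ReportedFilter:
    """Keeps client2's internal attribute and drops stale 'reported' messages."""

    def __init__(self):
        self.last_seq = -1
        self.attribute = None

    def handle(self, payload):
        """Apply the payload if it is newer than anything seen; return True if applied."""
        data = json.loads(payload)
        if data["seq"] <= self.last_seq:
            return False  # stale, e.g. an old retained message replayed after reconnect
        self.last_seq = data["seq"]
        self.attribute = data["value"]
        return True
```

This only protects client2's own state. The broker would still hold the stale retained message, so a freshly connected subscriber would still see “initial”.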

Regards,
Tobias