Broker connection times out

I recompiled a project that is a few years old for my ESP8266, and it can no longer connect to the MQTT broker. The credentials and everything else are correct; I already have a lot of devices on the broker.
When the ESP tries to connect to the broker, it is rebooted by the watchdog timer. I found a note suggesting reducing MQTT_SOCKET_TIMEOUT from 15 to 1, in which case the connection attempt fails with rc=-4.

I must be missing something obvious, as the only references I can find to this are posts from many years ago.

I am using Arduino IDE 2.3.4; the board manager packages and the libraries have all been updated to the latest versions.

Any ideas what to check next?

Hi csongor.varga,

A good way to look into this more closely would be to check the Mosquitto logs for what is happening on the broker at the moment you try to connect your client.

For this, go into mosquitto.conf and add log_type all. Restart the broker and check what is logged at the time of the connection.
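
As a rough illustration, the relevant part of mosquitto.conf could look like the fragment below. Only log_type all comes from the suggestion above; the log file path is an assumption that depends on the installation, and the timestamp format line is optional and assumes a Mosquitto version recent enough to support it.

```
# mosquitto.conf (fragment)
log_type all

# Assumed path; adjust to wherever your installation keeps its logs.
log_dest file /var/log/mosquitto/mosquitto.log

# Optional: human-readable timestamps instead of epoch seconds.
log_timestamp true
log_timestamp_format %Y-%m-%dT%H:%M:%S

# Log client connect and disconnect events.
connection_messages true
```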

If that does not help, please post those here in the thread! :)
More details about your client connections and the contents of your mosquitto.conf file would also help.

Thanks for the feedback. I did a bit more investigation.

If I run the example MQTT sketch (mqtt_esp8266), it works fine: it connects to the cloud server as well as to my local server. But my more “advanced” sketch fails.

I have a 1.5-year-old laptop that I retired but that still works. This is where the sketch that is not working was originally developed. It has Arduino IDE 1.8.4, and also version 2.8 of PubSubClient, and after recompiling the exact same project there, it connects to the MQTT server.

I changed the log level as suggested and see the following in the logs:
This is where the ESP tries to connect (local ip is 192.168.1.129 - DHCP).

1738009268: Sending PUBLISH to lightstation (d0, q0, r0, m0, 'lightstation/status', ... (41 bytes))
1738009268: Sending PUBLISH to nodered_7ed53f54613414e6 (d0, q0, r0, m0, 'lightstation/status', ... (41 bytes))
1738009268: New connection from 192.168.1.129:57623 on port 1883.
1738009268: Received PUBLISH from SHP2 (d0, q0, r0, m0, 'stat/shp2/STATUS8', ... (233 bytes))
1738009268: Sending PUBLISH to node-red (d0, q0, r0, m0, 'stat/shp2/STATUS8', ... (233 bytes))

After that I am not seeing any log entries for this client or IP. And finally:

1738009271: Sending PINGRESP to growatt-224fdc
1738009271: Received PINGREQ from wordclock
1738009271: Sending PINGRESP to wordclock
1738009272: Client <unknown> has exceeded timeout, disconnecting.
1738009272: Received PUBLISH from ESPClient1 (d0, q0, r0, m0, '/Sonoff1/Uptime/', ... (3 bytes))
1738009272: Sending PUBLISH to node-red (d0, q0, r0, m0, '/Sonoff1/Uptime/', ... (3 bytes))
1738009272: Sending PUBLISH to ESPClient1 (d0, q0, r0, m0, '/Sonoff1/Uptime/', ... (3 bytes))

After deleting the log_type all, I see the same in the logs:

1738009665: New client connected from 127.0.0.1:43260 as NodeRed (p2, c1, k60, u'xxx').
1738009665: New client connected from 192.168.1.80:58562 as node-red (p2, c1, k60, u'xxx').
1738009670: New connection from 192.168.1.160:63340 on port 1883.
1738009670: New client connected from 192.168.1.160:63340 as ESP32_38DA0C (p2, c1, k120, u'xxx').
1738009671: New connection from 192.168.1.129:51884 on port 1883.
1738009677: New connection from 192.168.1.129:63062 on port 1883.
1738009684: New connection from 192.168.1.129:63402 on port 1883.
1738009690: New connection from 192.168.1.129:53412 on port 1883.
1738009697: New connection from 192.168.1.129:63693 on port 1883.
1738009704: New connection from 192.168.1.129:61464 on port 1883.
1738009710: New connection from 192.168.1.129:62423 on port 1883.
1738009711: New client connected from 192.168.1.144:52480 as shellymotion2-84FD276EDB66 (p2, c1, k7200, u'xxx').
1738009717: New connection from 192.168.1.129:51517 on port 1883.
1738009724: New connection from 192.168.1.129:62967 on port 1883.
1738009730: New connection from 192.168.1.129:53203 on port 1883.
1738009737: New connection from 192.168.1.129:58644 on port 1883.
1738009743: New connection from 192.168.1.129:55425 on port 1883.
1738009746: Client <unknown> has exceeded timeout, disconnecting.
1738009750: New connection from 192.168.1.129:56321 on port 1883.
1738009752: Client <unknown> has exceeded timeout, disconnecting.
1738009757: New connection from 192.168.1.129:57130 on port 1883.
1738009758: Client <unknown> has exceeded timeout, disconnecting.
1738009763: New connection from 192.168.1.129:57904 on port 1883.
1738009764: Client <unknown> has exceeded timeout, disconnecting.
1738009770: Client <unknown> has exceeded timeout, disconnecting.
1738009770: New connection from 192.168.1.129:49630 on port 1883.
1738009776: Client <unknown> has exceeded timeout, disconnecting.
1738009776: New connection from 192.168.1.129:60282 on port 1883.
1738009782: Client <unknown> has exceeded timeout, disconnecting.
1738009783: New connection from 192.168.1.129:61582 on port 1883.

The MQTT part in my setup looks like this:

  // Set up the MQTT server connection
  if (mqtt_server!="") {
    mqtt.setServer(mqtt_server, 1883);
    mqtt.setBufferSize(1024);
    mqtt.setCallback(callback);
  }
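
One detail worth double-checking in this snippet, although it is probably unrelated to the watchdog reset: the post does not show how mqtt_server is declared. If it is an Arduino String, != "" compares contents as intended. If it is a char array, the comparison is between addresses and is effectively always true. A minimal sketch of an explicit check for the char-array case follows; hasServer is a hypothetical helper name.

```cpp
// Returns true when a C string holds a non-empty host name.
// Assumes the server name is stored as a NUL-terminated char array.
bool hasServer(const char* server) {
  return server != nullptr && server[0] != '\0';
}
```

With that, the guard in setup() would read if (hasServer(mqtt_server)) { ... }.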

I am using the same concept in my code as in the example. The main loop checks whether we are connected to the MQTT server and, if not, it calls the .connect method:

// MQTT reconnect logic
void reconnect() {
  //String mytopic;
  // Loop until we're reconnected
  while (!mqtt.connected()) {
    Serial.print("Attempting MQTT connection...");
    byte mac[6];                     // the MAC address of your Wifi shield
    WiFi.macAddress(mac);
    sprintf(newclientid,"%s-%02x%02x%02x",clientID,mac[2],mac[1],mac[0]);
    Serial.print(F("Client ID: "));
    Serial.println(newclientid);
    // Attempt to connect
    if (mqtt.connect(newclientid, mqtt_user, mqtt_password)) {
      Serial.println(F("connected"));
      // ... and resubscribe
      char topic[80];
      sprintf(topic,"%swrite/#",topicRoot);
      mqtt.subscribe(topic);
    } else {
      Serial.print(F("failed, rc="));
      Serial.print(mqtt.state());
      Serial.println(F(" try again in 5 seconds"));
      // Wait 5 seconds before retrying
      delay(5000);
    }
  }
}
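
The reconnect() above blocks inside a while loop until the broker accepts the connection. PubSubClient also ships a non-blocking reconnect example that retries on a timer from loop() instead. Below is a minimal sketch of just that timing logic, written so that it compiles off-device. RetryTimer is a hypothetical helper, not part of PubSubClient, and this is not claimed to fix the watchdog reset described in this thread.

```cpp
#include <cstdint>

// Decides when the next connection attempt is allowed, without blocking.
// Uses unsigned subtraction so it keeps working when millis() wraps around.
class RetryTimer {
 public:
  explicit RetryTimer(std::uint32_t intervalMs) : intervalMs_(intervalMs) {}

  // Returns true on the first call, and afterwards whenever at least
  // intervalMs_ milliseconds have passed since the last attempt.
  bool due(std::uint32_t nowMs) {
    if (started_ &&
        static_cast<std::uint32_t>(nowMs - lastAttemptMs_) < intervalMs_) {
      return false;
    }
    started_ = true;
    lastAttemptMs_ = nowMs;
    return true;
  }

 private:
  std::uint32_t intervalMs_;
  std::uint32_t lastAttemptMs_ = 0;
  bool started_ = false;
};
```

In the sketch, loop() would make a single mqtt.connect(newclientid, mqtt_user, mqtt_password) call when !mqtt.connected() and a RetryTimer created with 5000 reports due(millis()), and would otherwise carry on with mqtt.loop() and the rest of the work.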

And this is what I get in the serial monitor:

sdm120 Solar Inverter to MQTT Gateway
Connecting to Wifi.......
Connected to wifi network
IP address: 192.168.1.129
Signal [RSSI]: -70
Modbus connection is set up
HTTP server started
Attempting MQTT connection...Client ID: abcd-05613c

--------------- CUT HERE FOR EXCEPTION DECODER ---------------

Soft WDT reset
...

I have an os_timer set to fire every second. I commented that out, but the issue remains.
The sketch is a custom Modbus serial to MQTT converter. It reads a sensor via Modbus using the ModbusMaster library and SoftwareSerial. But the entire serial read process is only enabled if we are connected to MQTT. Connecting to the MQTT server is the first check in the main loop, which is executed right after setup().

Hi,

If I understand this correctly, the client is being reset by the WDT some time after making the Serial.println(newclientid); call.

The code around that area looks like this:

    Serial.println(newclientid);
    // Attempt to connect
    if (mqtt.connect(newclientid, mqtt_user, mqtt_password)) {
      Serial.println(F("connected"));
      ...
    } else {
      Serial.print(F("failed, rc="));

Whether the call to mqtt.connect() succeeds or fails, the next line prints to serial again. Given that your log doesn’t include either of those lines, it must be getting stuck in the mqtt.connect() call.

In the Mosquitto log you will see this sort of entry when the socket is first opened:

1738009670: New connection from 192.168.1.160:63340 on port 1883.

And this sort of entry when the broker has received the CONNECT packet from the client:

1738009670: New client connected from 192.168.1.160:63340 as ESP32_38DA0C (p2, c1, k120, u'xxx').

We can hence see from the Mosquitto log that the broker isn’t receiving a CONNECT packet from your client.
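
As a rough way to apply the same check to a longer log, the sketch below lists every ip:port that opened a socket but never got a matching "New client connected" line. It relies only on the two message formats quoted in this thread, and socketsWithoutConnect is a hypothetical helper name.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Returns the "ip:port" endpoints that logged "New connection from" but never
// logged "New client connected from", in order of first appearance.
std::vector<std::string> socketsWithoutConnect(const std::string& log) {
  const std::string opened = "New connection from ";
  const std::string connected = "New client connected from ";
  std::vector<std::string> order;          // endpoints in the order they opened
  std::map<std::string, bool> gotConnect;  // endpoint -> CONNECT seen?
  std::istringstream in(log);
  std::string line;
  while (std::getline(in, line)) {
    std::size_t pos = line.find(opened);
    if (pos != std::string::npos) {
      std::size_t start = pos + opened.size();
      std::string ep = line.substr(start, line.find(' ', start) - start);
      if (gotConnect.find(ep) == gotConnect.end()) {
        order.push_back(ep);
        gotConnect[ep] = false;
      }
      continue;
    }
    pos = line.find(connected);
    if (pos != std::string::npos) {
      std::size_t start = pos + connected.size();
      std::string ep = line.substr(start, line.find(' ', start) - start);
      gotConnect[ep] = true;
    }
  }
  std::vector<std::string> result;
  for (const std::string& ep : order) {
    if (!gotConnect[ep]) result.push_back(ep);
  }
  return result;
}
```

Fed the log excerpt posted earlier in the thread, a scan like this would be expected to report only the 192.168.1.129 endpoints, since the other clients all show a "New client connected" line.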

I’m not really sure what to say at this point. Have you tried working from a completely clean example to see if that works, then gradually making changes until it works as you want?

Regards,

Roger