MQTT connection failures with Mosquitto #7648

@tkerby

Description

I'm seeing communication failures between Telegraf and Mosquitto in a Docker setup: the connection drops and does not recover. This may be related to issue #4594, which shows similar behaviour, although I believe that issue was resolved.

Relevant telegraf.conf:

# # Read metrics from MQTT topic(s)
[[inputs.mqtt_consumer]]
  name_override = "mqtt_home"

  ## MQTT broker URLs to be used. The format should be scheme://host:port,
  ## schema can be tcp, ssl, or ws.
  servers = ["tcp://mosquitto:1883"]

  ## Topics that will be subscribed to.
  topics = ["home/devices/+/+/up"]

  ## The message topic will be stored in a tag specified by this value.  If set
  ## to the empty string no topic tag will be created.
  # topic_tag = "topic"

  ## QoS policy for messages
  ##   0 = at most once
  ##   1 = at least once
  ##   2 = exactly once
  ##
  ## When using a QoS of 1 or 2, you should enable persistent_session to allow
  ## resuming unacknowledged messages.
  qos = 0

  ## Connection timeout for initial connection in seconds
  connection_timeout = "60s"

  ## Maximum messages to read from the broker that have not been written by an
  ## output.  For best throughput set based on the number of metrics within
  ## each message and the size of the output's metric_batch_size.
  ##
  ## For example, if each message from the queue contains 10 metrics and the
  ## output metric_batch_size is 1000, setting this to 100 will ensure that a
  ## full batch is collected and the write is triggered immediately without
  ## waiting until the next flush_interval.
  max_undelivered_messages = 1

  ## Persistent session disables clearing of the client session on connection.
  ## In order for this option to work you must also set client_id to identify
  ## the client.  To receive messages that arrived while the client is offline,
  ## also set the qos option to 1 or 2 and don't forget to also set the QoS when
  ## publishing.
  persistent_session = false

  ## If unset, a random client ID will be generated.
  client_id = "telegraf1"

  ## Username and password to connect MQTT server.
  username = "*******"
  password = "*******"

  ## Optional TLS Config
  # tls_ca = "/etc/telegraf/ca.pem"
  # tls_cert = "/etc/telegraf/cert.pem"
  # tls_key = "/etc/telegraf/key.pem"
  ## Use TLS but skip chain & host verification
  # insecure_skip_verify = false

  ## Data format to consume.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "json"
  interval = "20s"

System info:

Telegraf 1.14.3 with Mosquitto 1.6.10

Docker

version: '3.8'
services:
    node-red:
        container_name: node-red
        restart: unless-stopped
        image: 'nodered/node-red:latest'
        environment:
            - TZ=Europe/London
        ports:
            - '1880:1880'
        networks:
            - iot-net
        volumes:
            - /mnt/SSD/docker/node-red/data:/data
        user: root:root
    mosquitto:
        container_name: mosquitto
        restart: unless-stopped
        image: eclipse-mosquitto
        environment:
            - TZ=Europe/London
        ports:
            - '1883:1883'
            - '9001:9001'
        networks:
            - iot-net
        volumes:
            - /mnt/SSD/docker/mosquitto/config:/mosquitto/config
            - /mnt/SSD/docker/mosquitto/data:/mosquitto/data
            - /mnt/SSD/docker/mosquitto/log:/mosquitto/log
    influxdb:
        container_name: influxdb
        restart: unless-stopped
        image: influxdb
        environment:
            - TZ=Europe/London
            - INFLUXDB_HTTP_AUTH_ENABLED=true
        networks:
            - iot-net
        volumes:
            - /mnt/SSD/docker/influxdb/data:/var/lib/influxdb
        ports:
            # The API for InfluxDB is served on port 8086
            - "8082:8082"
            - "8086:8086"
            # UDP Port
            - "8089:8089"
        privileged: true
    telegraf:
        container_name: telegraf
        restart: unless-stopped
        image: telegraf
        networks:
            - iot-net
        environment:
            - interval=60s
            - flush_interval=60s
        depends_on:
            - influxdb
            - mosquitto
        privileged: true
        ports:
            - 8125:8125
            - 8092:8092
            - 8094:8094
        volumes:
            - /mnt/SSD/docker/telegraf/config:/etc/telegraf
            - /var/run/docker.sock:/var/run/docker.sock
    grafana:
        container_name: grafana
        restart: unless-stopped
        image: grafana/grafana
        environment:
            - TZ=Europe/London
        depends_on:
            - influxdb
        networks:
            - iot-net
        volumes:
            - /mnt/SSD/docker/grafana/data:/var/lib/grafana
            - /mnt/SSD/docker/grafana/config/grafana.ini:/etc/grafana/grafana.ini      
        ports:
            - '3000:3000'
    nginx:
        container_name: nginx
        restart: unless-stopped
        image: nginx
        depends_on:
            - grafana
            - node-red
        environment:
            - TZ=Europe/London
        ports:
            - '80:80'
            - '443:443'
        networks:
            - iot-net
        volumes:
            - /mnt/SSD/docker/nginx/config:/etc/nginx
            - /mnt/SSD/docker/nginx/certs:/etc/ssl/private



networks:
    iot-net: null

Steps to reproduce:

Mosquitto is set up with basic password authentication. Telegraf connected fine for the first 10 hours or so, then failed. There were no config changes at the time. It still fails after a restart, even after cleaning up with docker-compose down.

I'm pulling data from the local Mosquitto broker.

The problem seems to be related to pingresp: I see an error on both sides.

Expected behavior:

I expect Telegraf to pull the MQTT data into InfluxDB. Sensor data arrives approximately every 15 seconds.

Actual behavior:

No data is transferred to InfluxDB. Telegraf reconnects repeatedly.

Additional info:

Can subscribe to the topic from a desktop without an issue:
C:\Program Files\mosquitto>mosquitto_sub -h grafana.local -t home/devices/+/+/up -u ***** -P ***** -v
home/devices/garage/beerfridge/up {"temperature_fridge_air_c":4.4375}
home/devices/garage/beerfridge/up {"temperature_fridge_air_c":4.375}
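
To rule out the subscription filter itself, here is a minimal Python sketch of MQTT topic filter matching. It is not Telegraf or Mosquitto code; `topic_matches` is a hypothetical helper, and it ignores the special handling of topics that start with `$`. It only illustrates that the filter in the config should match the topic seen in the output above.

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT topic matches a subscription filter.

    '+' matches exactly one level, '#' matches the remaining levels.
    Simplified sketch: topics beginning with '$' are not treated specially.
    """
    filter_parts = topic_filter.split("/")
    topic_parts = topic.split("/")
    for i, part in enumerate(filter_parts):
        if part == "#":
            # Multi-level wildcard: everything from here on matches.
            return True
        if i >= len(topic_parts):
            # The filter has more levels than the topic.
            return False
        if part != "+" and part != topic_parts[i]:
            return False
    # Without '#', both must have the same number of levels.
    return len(filter_parts) == len(topic_parts)


print(topic_matches("home/devices/+/+/up", "home/devices/garage/beerfridge/up"))
```

With the filter and topic from this issue the check passes, so the filter is unlikely to be the reason no data arrives.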

I'm also OK pulling MQTT data from two other remote sources (The Things Network).

I've tried:

  • making the connection persistent and changing QoS settings
  • changing timings
  • flushing the Mosquitto database

Mosquitto log:

2020-06-08T12:26:48: mosquitto version 1.6.10 starting
2020-06-08T12:26:48: Config loaded from /mosquitto/config/mosquitto.conf.
2020-06-08T12:26:48: Opening ipv4 listen socket on port 1883.
2020-06-08T12:26:48: Opening ipv6 listen socket on port 1883.
2020-06-08T12:26:51: New connection from 192.168.1.238 on port 1883.
2020-06-08T12:26:51: New client connected from 192.168.1.238 as beerfridge (p2, c1, k15, u'sensors').
2020-06-08T12:26:53: New connection from 172.18.0.6 on port 1883.
2020-06-08T12:26:53: New client connected from 172.18.0.6 as telegraf1 (p2, c1, k60, u'sensors').
2020-06-08T12:28:13: Socket error on client telegraf1, disconnecting.
2020-06-08T12:28:20: New connection from 172.18.0.6 on port 1883.
2020-06-08T12:28:20: New client connected from 172.18.0.6 as telegraf1 (p2, c1, k60, u'sensors').
2020-06-08T12:29:35: Socket error on client telegraf1, disconnecting.
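
For anyone reading the broker log, here is a small Python sketch that pulls the connection parameters out of the "New client connected" lines. The field meanings (p = protocol code, c = clean session flag, k = keepalive in seconds, u = username) are my reading of the Mosquitto log format and should be treated as an assumption; `parse_connect` is a hypothetical helper, not part of Mosquitto.

```python
import re

# Pattern for lines such as:
#   New client connected from 172.18.0.6 as telegraf1 (p2, c1, k60, u'sensors').
CONNECT_RE = re.compile(
    r"New client connected from (?P<addr>\S+) as (?P<client_id>\S+) "
    r"\(p(?P<p>\d+), c(?P<c>\d+), k(?P<k>\d+)(?:, u'(?P<u>[^']*)')?\)"
)


def parse_connect(line):
    """Return the connection parameters from one log line, or None if it does not match."""
    m = CONNECT_RE.search(line)
    if m is None:
        return None
    return {
        "address": m["addr"],
        "client_id": m["client_id"],
        "protocol_code": int(m["p"]),    # assumed: broker's protocol version code
        "clean_session": m["c"] == "1",  # assumed: 1 means clean session
        "keepalive_s": int(m["k"]),      # assumed: keepalive in seconds
        "username": m["u"],              # None if no username was logged
    }


line = "2020-06-08T12:26:53: New client connected from 172.18.0.6 as telegraf1 (p2, c1, k60, u'sensors')."
print(parse_connect(line))
```

Under that reading, telegraf1 connects with a clean session and a 60 second keepalive, while beerfridge uses a 15 second keepalive.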

Telegraf log:

2020-06-08T12:26:53Z I! Starting Telegraf 1.14.3
2020-06-08T12:26:53Z I! Using config file: /etc/telegraf/telegraf.conf
2020-06-08T12:26:53Z I! Loaded inputs: cpu processes swap docker openweathermap disk diskio kernel mem system mqtt_consumer mqtt_consumer mqtt_consumer
2020-06-08T12:26:53Z I! Loaded aggregators:
2020-06-08T12:26:53Z I! Loaded processors: regex
2020-06-08T12:26:53Z I! Loaded outputs: influxdb
2020-06-08T12:26:53Z I! Tags enabled: host=57ce4e91341b
2020-06-08T12:26:53Z I! [agent] Config: Interval:1m0s, Quiet:false, Hostname:"57ce4e91341b", Flush Interval:1m0s
2020-06-08T12:26:53Z I! [inputs.mqtt_consumer] Connected [tcp://eu.thethings.network:1883]
2020-06-08T12:26:53Z I! [inputs.mqtt_consumer] Connected [tcp://eu.thethings.network:1883]
2020-06-08T12:26:53Z I! [inputs.mqtt_consumer] Connected [tcp://mosquitto:1883]
2020-06-08T12:28:13Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: pingresp not received, disconnecting
2020-06-08T12:28:20Z I! [inputs.mqtt_consumer] Connected [tcp://mosquitto:1883]
2020-06-08T12:29:35Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: pingresp not received, disconnecting
2020-06-08T12:29:40Z I! [inputs.mqtt_consumer] Connected [tcp://mosquitto:1883]
2020-06-08T12:31:00Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: pingresp not received, disconnecting
2020-06-08T12:31:20Z I! [inputs.mqtt_consumer] Connected [tcp://mosquitto:1883]
2020-06-08T12:32:40Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: pingresp not received, disconnecting
2020-06-08T12:33:00Z I! [inputs.mqtt_consumer] Connected [tcp://mosquitto:1883]
2020-06-08T12:34:20Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: pingresp not received, disconnecting
2020-06-08T12:34:40Z I! [inputs.mqtt_consumer] Connected [tcp://mosquitto:1883]
2020-06-08T12:36:00Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: pingresp not received, disconnecting
2020-06-08T12:36:20Z I! [inputs.mqtt_consumer] Connected [tcp://mosquitto:1883]
2020-06-08T12:37:40Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: pingresp not received, disconnecting
2020-06-08T12:38:00Z I! [inputs.mqtt_consumer] Connected [tcp://mosquitto:1883]
2020-06-08T12:39:20Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: pingresp not received, disconnecting
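
To quantify how long each session survives, here is a rough Python sketch that measures the time between each "Connected [tcp://mosquitto:1883]" line and the next "connection lost" line. It assumes every "connection lost" line belongs to the local broker (the log does not say which plugin instance logged it), and `session_lengths` is a hypothetical helper written for this report. Only the first three sessions from the log above are included.

```python
from datetime import datetime

# First three connect/disconnect pairs for the local broker, copied from the Telegraf log.
LOG = """\
2020-06-08T12:26:53Z I! [inputs.mqtt_consumer] Connected [tcp://mosquitto:1883]
2020-06-08T12:28:13Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: pingresp not received, disconnecting
2020-06-08T12:28:20Z I! [inputs.mqtt_consumer] Connected [tcp://mosquitto:1883]
2020-06-08T12:29:35Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: pingresp not received, disconnecting
2020-06-08T12:29:40Z I! [inputs.mqtt_consumer] Connected [tcp://mosquitto:1883]
2020-06-08T12:31:00Z E! [inputs.mqtt_consumer] Error in plugin: connection lost: pingresp not received, disconnecting
"""


def session_lengths(log_text, broker="tcp://mosquitto:1883"):
    """Return the seconds between each connect to `broker` and the next 'connection lost' line."""
    lengths = []
    connected_at = None
    for line in log_text.splitlines():
        stamp, _, rest = line.partition(" ")
        try:
            when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if f"Connected [{broker}]" in rest:
            connected_at = when
        elif "connection lost" in rest and connected_at is not None:
            lengths.append(int((when - connected_at).total_seconds()))
            connected_at = None
    return lengths


print(session_lengths(LOG))
```

For these lines the sessions last 80, 75 and 80 seconds, a little longer than the k60 value shown for telegraf1 in the Mosquitto log (which I take to be the keepalive).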

Labels

area/mqtt, bug (unexpected problem or unintended behavior)