Skip to content

flaky unit test: bridge_reconnects_successfully_after_local_connection_interrupted #3021

@reubenmiller

Description

@reubenmiller

Describe the bug

Flaky unit tests:

Though a workaround has been added in a PR, which adds a sleep before interrupting the connection.

Fail

Below shows an example when the test fails.

    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.50s
    Starting 1 test across 2 binaries (52 skipped; run ID: 22061f7c-17a5-42de-a646-e3e1ac124df8, nextest profile: default)
       START             tedge_mqtt_bridge::bridge bridge_reconnects_successfully_after_local_connection_interrupted

running 1 test
TEST: local_broker_port: 59103
TEST: cloud_broker_port: 59104
[2024-07-29T11:17:56Z INFO  tedge_mqtt_bridge::health] MQTT bridge connected to cloud broker
[2024-07-29T11:17:56Z INFO  tedge_mqtt_bridge] Bridge cloud connection "cloud" subscribing to [Filter = s/ds, Qos = AtLeastOnce]
[2024-07-29T11:17:56Z INFO  tedge_mqtt_bridge::health] MQTT bridge connected to local broker
[2024-07-29T11:17:56Z INFO  tedge_mqtt_bridge] Bridge cloud connection "local" subscribing to [Filter = c8y/s/us, Qos = AtLeastOnce]
[2024-07-29T11:17:56Z ERROR tedge_mqtt_bridge::health] MQTT bridge failed to connect to local broker: Mqtt state: Io error: Custom { kind: ConnectionAborted, error: "connection closed by peer" }
[2024-07-29T11:17:56Z ERROR tedge_mqtt_bridge::health] MQTT bridge failed to connect to local broker: I/O: connection closed by peer
[2024-07-29T11:17:56Z INFO  tedge_mqtt_bridge::health] MQTT bridge connected to local broker
[2024-07-29T11:17:56Z INFO  tedge_mqtt_bridge] Bridge cloud connection "local" subscribing to [Filter = c8y/s/us, Qos = AtLeastOnce]
thread 'bridge_reconnects_successfully_after_local_connection_interrupted' panicked at crates/extensions/tedge_mqtt_bridge/tests/bridge.rs:241:58:
called `Result::unwrap()` on an `Err` value: timed-out waiting for received message

Caused by:
    deadline has elapsed
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
test bridge_reconnects_successfully_after_local_connection_interrupted ... FAILED

failures:

failures:
    bridge_reconnects_successfully_after_local_connection_interrupted

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 3 filtered out; finished in 5.06s

        FAIL [   5.064s] tedge_mqtt_bridge::bridge bridge_reconnects_successfully_after_local_connection_interrupted
   Canceling due to test failure
------------
     Summary [   5.065s] 1 test run: 0 passed, 1 failed, 52 skipped
        FAIL [   5.064s] tedge_mqtt_bridge::bridge bridge_reconnects_successfully_after_local_connection_interrupted
error: test run failed

Pass

Below shows an example when the test passes.

    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.47s
    Starting 1 test across 2 binaries (52 skipped; run ID: 8c895c93-30fb-4ea6-98c1-2dc2cd7d9299, nextest profile: default)
       START             tedge_mqtt_bridge::bridge bridge_reconnects_successfully_after_local_connection_interrupted

running 1 test
TEST: local_broker_port: 59091
TEST: cloud_broker_port: 59092
[2024-07-29T11:17:55Z INFO  tedge_mqtt_bridge::health] MQTT bridge connected to cloud broker
[2024-07-29T11:17:55Z INFO  tedge_mqtt_bridge] Bridge cloud connection "cloud" subscribing to [Filter = s/ds, Qos = AtLeastOnce]
[2024-07-29T11:17:55Z INFO  tedge_mqtt_bridge::health] MQTT bridge connected to local broker
[2024-07-29T11:17:55Z INFO  tedge_mqtt_bridge] Bridge cloud connection "local" subscribing to [Filter = c8y/s/us, Qos = AtLeastOnce]
[2024-07-29T11:17:55Z ERROR tedge_mqtt_bridge::health] MQTT bridge failed to connect to local broker: Mqtt state: Io error: Custom { kind: ConnectionAborted, error: "connection closed by peer" }
[2024-07-29T11:17:55Z ERROR tedge_mqtt_bridge::health] MQTT bridge failed to connect to local broker: I/O: Connection reset by peer (os error 54)
[2024-07-29T11:17:55Z INFO  tedge_mqtt_bridge::health] MQTT bridge connected to local broker
[2024-07-29T11:17:55Z INFO  tedge_mqtt_bridge] Bridge cloud connection "local" subscribing to [Filter = c8y/s/us, Qos = AtLeastOnce]
test bridge_reconnects_successfully_after_local_connection_interrupted ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 3 filtered out; finished in 0.05s

        PASS [   0.056s] tedge_mqtt_bridge::bridge bridge_reconnects_successfully_after_local_connection_interrupted
------------
     Summary [   0.056s] 1 test run: 1 passed, 52 skipped

Notes

  • Use a fixed ports does not seem to help (so that should rule out a race condition between getting a "free port" and using it)

To Reproduce

A flaky test can be reproduced by re-running the tests until it fails. Sometimes it takes more than ~30 attempts

RUNS=0; while cargo nextest run -p tedge_mqtt_bridge "bridge_reconnects_successfully_after_local_connection_interrupted" --nocapture; do RUNS=$((RUNS+1)) ; echo repeating test run $RUNS; done; echo "Failed after $RUNS runs"

Expected behavior

The test should not fail sporadically.

Screenshots

Environment (please complete the following information):

  • OS [incl. version]: MacOS and Ubuntu 22.04
  • Hardware [incl. revision]: Macbook Pro
  • System-Architecture [e.g. result of "uname -a"]: aarch64 and x86_64
  • thin-edge.io version [e.g. 0.1.0]: 1.1.2~306+gfae4a56

Additional context

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingflaky testLabel given to flaky or otherwise faulty tests

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions