Skip to content

c8y proxy fails with sporadic dns lookup address errors #2321

@reubenmiller

Description

@reubenmiller

Describe the bug

Requests to thin-edge.io Cumulocity IoT proxy (localhost:8001) returns sporadically with a 502 Bad Gateway status code.

The tedge-mapper-c8y which is providing the local Cumulocity IoT proxy logs the following error on a failed proxy attempt:

Oct 06 11:32:26 pippin tedge-mapper[459]: 2023-10-06T01:32:26.117459217Z ERROR c8y_auth_proxy::server:   × making proxied request to https://test-ci-runner01.latest.stage.c8y.io/
Oct 06 11:32:26 pippin tedge-mapper[459]:   │ event/events/3767200/binaries
Oct 06 11:32:26 pippin tedge-mapper[459]:   ├─▶ error sending request for url (https://test-ci-
Oct 06 11:32:26 pippin tedge-mapper[459]:   │   runner01.latest.stage.c8y.io/event/events/3767200/binaries): error
Oct 06 11:32:26 pippin tedge-mapper[459]:   │   trying to connect: dns error: failed to lookup address information:
Oct 06 11:32:26 pippin tedge-mapper[459]:   │   Try again
Oct 06 11:32:26 pippin tedge-mapper[459]:   ├─▶ error trying to connect: dns error: failed to lookup address
Oct 06 11:32:26 pippin tedge-mapper[459]:   │   information: Try again
Oct 06 11:32:26 pippin tedge-mapper[459]:   ├─▶ dns error: failed to lookup address information: Try again
Oct 06 11:32:26 pippin tedge-mapper[459]:   ╰─▶ failed to lookup address information: Try again

However using curl to communicate directly with the Cumulocity IoT tenant is able to successfully communicate with the tenant. A script was written to compare requests against the local proxy and direct communication with the Cumulocity IoT Tenant:

In combination with the duration of the failed requests and the error message printed in the log, there is a strong correlation with the duration of the request and the failure rate. It seems that if the DNS resolver takes longer than 5 seconds, then proxied request fails.

$ ./check_proxy.sh
------- Attempt 1 (12:09:47) -------
proxy	PASSED	2s
curl	PASSED	1s
------- Attempt 2 (12:09:55) -------
proxy	PASSED	2s
curl	PASSED	6s
------- Attempt 3 (12:10:08) -------
proxy	PASSED	2s
curl	PASSED	1s
------- Attempt 4 (12:10:16) -------
proxy	PASSED	2s
curl	PASSED	6s
------- Attempt 5 (12:10:29) -------
proxy	FAILED	5s
curl	PASSED	1s
------- Attempt 6 (12:10:40) -------
proxy	FAILED	5s
curl	PASSED	1s
------- Attempt 7 (12:10:51) -------
proxy	PASSED	2s
curl	PASSED	1s
------- Attempt 8 (12:10:59) -------
proxy	FAILED	5s
curl	PASSED	1s
------- Attempt 9 (12:11:10) -------
proxy	FAILED	5s
curl	PASSED	7s
------- Attempt 10 (12:11:27) -------
proxy	PASSED	1s
curl	PASSED	1s
-------------------------
------- Summary ---------
Attempts: 10
proxy failures: 4
curl failures: 0

To Reproduce

  1. Install thin-edge.io and connect it to a Cumulocity IoT tenant

  2. Add a non-existent IP address as a nameserver to the /etc/resolv.conf file. There must still be some valid nameservers in the list, and the non-existent IP address should still be a valid ip address.

    For example, 192.168.213.213 is not reachable (tested via ping 192.168.213.213):

    nameserver 192.168.213.213
    nameserver fd00::3e37:12ff:fe83:2a0d
    nameserver 2001:44b8:2167:a100:3e37:12ff:fe83:2a0d
    
  3. Copy the "check_proxy.sh" script (from the "Additional Context" section)

  4. Execute the check_proxy.sh script

    chmod +x check_proxy.sh
    ./check_proxy.sh
    

    It is expected that the proxy failures should be zero (assuming there is no server communication issues)

Expected behavior

The proxy should be able to handle DNS resolution failures due to misconfiguration of the nameserver (e.g. inside the /etc/resolv.conf. As long as there are some valid DNS nameservers configured, then the requests should not fail.

Screenshots

Environment (please complete the following information):

Property Value
OS [incl. version] Debian GNU/Linux 11 (bullseye)
Hardware [incl. revision] Raspberry Pi Zero 2 W Rev 1.0
System-Architecture Linux pippin 6.1.21-v8+ #1642 SMP PREEMPT Mon Apr 3 17:24:16 BST 2023 aarch64 GNU/Linux
thin-edge.io version tedge 0.12.1~365+g5a52630

Additional context

This section contains the log output of both the test

check_proxy.sh output

$ ./check_proxy.sh
------- Attempt 1 (12:09:47) -------
proxy	PASSED	2s
curl	PASSED	1s
------- Attempt 2 (12:09:55) -------
proxy	PASSED	2s
curl	PASSED	6s
------- Attempt 3 (12:10:08) -------
proxy	PASSED	2s
curl	PASSED	1s
------- Attempt 4 (12:10:16) -------
proxy	PASSED	2s
curl	PASSED	6s
------- Attempt 5 (12:10:29) -------
proxy	FAILED	5s
curl	PASSED	1s
------- Attempt 6 (12:10:40) -------
proxy	FAILED	5s
curl	PASSED	1s
------- Attempt 7 (12:10:51) -------
proxy	PASSED	2s
curl	PASSED	1s
------- Attempt 8 (12:10:59) -------
proxy	FAILED	5s
curl	PASSED	1s
------- Attempt 9 (12:11:10) -------
proxy	FAILED	5s
curl	PASSED	7s
------- Attempt 10 (12:11:27) -------
proxy	PASSED	1s
curl	PASSED	1s
-------------------------
------- Summary ---------
Attempts: 10
proxy failures: 4
curl failures: 0

Note, the check_proxy.sh script uses two slightly different urls in the comparison between the proxy and direction HTTP call. The proxy uses the /tenant/currentTenant whilst the direct HTTP call uses /tenant/loginOptions. This is because the later does not require an authorization header reducing the need to fetch a token first. Unfortunately the /tenant/loginOptions cannot be used in the proxy case as this endpoint fails when the authorization header is provided (which the proxy adds by default). However the two URLs are still deemed to be similar enough, and both require the same host name to be resolved (since there is a high likelihood that this is a DNS issue and not an authorization issue).

tedge-mapper-c8y logs

$ journalctl -fu tedge-mapper-c8y -n 0
-- Journal begins at Wed 2023-05-03 13:11:24 AEST. --
Oct 06 12:10:34 pippin tedge-mapper[459]: 2023-10-06T02:10:34.388008161Z ERROR c8y_auth_proxy::server:   × making proxied request to https://test-ci-runner01.latest.stage.c8y.io/
Oct 06 12:10:34 pippin tedge-mapper[459]:   │ tenant/currentTenant
Oct 06 12:10:34 pippin tedge-mapper[459]:   ├─▶ error sending request for url (https://test-ci-
Oct 06 12:10:34 pippin tedge-mapper[459]:   │   runner01.latest.stage.c8y.io/tenant/currentTenant): error trying to
Oct 06 12:10:34 pippin tedge-mapper[459]:   │   connect: dns error: failed to lookup address information: Try again
Oct 06 12:10:34 pippin tedge-mapper[459]:   ├─▶ error trying to connect: dns error: failed to lookup address
Oct 06 12:10:34 pippin tedge-mapper[459]:   │   information: Try again
Oct 06 12:10:34 pippin tedge-mapper[459]:   ├─▶ dns error: failed to lookup address information: Try again
Oct 06 12:10:34 pippin tedge-mapper[459]:   ╰─▶ failed to lookup address information: Try again
Oct 06 12:10:45 pippin tedge-mapper[459]: 2023-10-06T02:10:45.632226176Z ERROR c8y_auth_proxy::server:   × making proxied request to https://test-ci-runner01.latest.stage.c8y.io/
Oct 06 12:10:45 pippin tedge-mapper[459]:   │ tenant/currentTenant
Oct 06 12:10:45 pippin tedge-mapper[459]:   ├─▶ error sending request for url (https://test-ci-
Oct 06 12:10:45 pippin tedge-mapper[459]:   │   runner01.latest.stage.c8y.io/tenant/currentTenant): error trying to
Oct 06 12:10:45 pippin tedge-mapper[459]:   │   connect: dns error: failed to lookup address information: Try again
Oct 06 12:10:45 pippin tedge-mapper[459]:   ├─▶ error trying to connect: dns error: failed to lookup address
Oct 06 12:10:45 pippin tedge-mapper[459]:   │   information: Try again
Oct 06 12:10:45 pippin tedge-mapper[459]:   ├─▶ dns error: failed to lookup address information: Try again
Oct 06 12:10:45 pippin tedge-mapper[459]:   ╰─▶ failed to lookup address information: Try again
Oct 06 12:11:04 pippin tedge-mapper[459]: 2023-10-06T02:11:04.6176006Z ERROR c8y_auth_proxy::server:   × making proxied request to https://test-ci-runner01.latest.stage.c8y.io/
Oct 06 12:11:04 pippin tedge-mapper[459]:   │ tenant/currentTenant
Oct 06 12:11:04 pippin tedge-mapper[459]:   ├─▶ error sending request for url (https://test-ci-
Oct 06 12:11:04 pippin tedge-mapper[459]:   │   runner01.latest.stage.c8y.io/tenant/currentTenant): error trying to
Oct 06 12:11:04 pippin tedge-mapper[459]:   │   connect: dns error: failed to lookup address information: Try again
Oct 06 12:11:04 pippin tedge-mapper[459]:   ├─▶ error trying to connect: dns error: failed to lookup address
Oct 06 12:11:04 pippin tedge-mapper[459]:   │   information: Try again
Oct 06 12:11:04 pippin tedge-mapper[459]:   ├─▶ dns error: failed to lookup address information: Try again
Oct 06 12:11:04 pippin tedge-mapper[459]:   ╰─▶ failed to lookup address information: Try again
Oct 06 12:11:15 pippin tedge-mapper[459]: 2023-10-06T02:11:15.844906908Z ERROR c8y_auth_proxy::server:   × making proxied request to https://test-ci-runner01.latest.stage.c8y.io/
Oct 06 12:11:15 pippin tedge-mapper[459]:   │ tenant/currentTenant
Oct 06 12:11:15 pippin tedge-mapper[459]:   ├─▶ error sending request for url (https://test-ci-
Oct 06 12:11:15 pippin tedge-mapper[459]:   │   runner01.latest.stage.c8y.io/tenant/currentTenant): error trying to
Oct 06 12:11:15 pippin tedge-mapper[459]:   │   connect: dns error: failed to lookup address information: Try again
Oct 06 12:11:15 pippin tedge-mapper[459]:   ├─▶ error trying to connect: dns error: failed to lookup address
Oct 06 12:11:15 pippin tedge-mapper[459]:   │   information: Try again
Oct 06 12:11:15 pippin tedge-mapper[459]:   ├─▶ dns error: failed to lookup address information: Try again
Oct 06 12:11:15 pippin tedge-mapper[459]:   ╰─▶ failed to lookup address information: Try again

file: check_proxy.sh

#!/bin/bash

set -e

PROXY_FAILURES=0
CURL_FAILURES=0

C8Y_URL=$(tedge config get c8y.url)

check_url() {
    attempt=$1
    printf -- '------- Attempt %s (%s) -------\n' "$attempt" "$(date +%H:%M:%S)"
    start=$(date +%s)
    RESULT="PASSED"
    # if ! curl -sf http://localhost:8001/c8y/event/events/3767200/binaries >/dev/null; then
    if ! curl -sf http://localhost:8001/c8y/tenant/currentTenant >/dev/null; then
        PROXY_FAILURES=$((PROXY_FAILURES + 1))
        RESULT="FAILED"
    fi
    end=$(date +%s)
    duration=$((end - start))
    printf 'proxy\t%s\t%ds\n' "$RESULT" "$duration"

    # using curl
    start=$(date +%s)
    RESULT="PASSED"
    
    if ! curl -sf "https://$C8Y_URL/tenant/loginOptions" >/dev/null; then
        CURL_FAILURES=$((CURL_FAILURES + 1))
        RESULT="FAILED"
    fi
    end=$(date +%s)
    duration=$((end - start))
    printf 'curl\t%s\t%ds\n' "$RESULT" "$duration"
}

COUNTER=1
TOTAL_TRIES=10
echo
while [ "$COUNTER" -le "$TOTAL_TRIES" ]; do
    check_url "$COUNTER"
    COUNTER=$((COUNTER + 1))
    sleep 5
done

printf -- '-------------------------\n'
printf -- '------- Summary ---------\n'
printf 'Attempts: %d\n' "$TOTAL_TRIES"
printf 'proxy failures: %d\n' "$PROXY_FAILURES"
printf 'curl failures: %d\n' "$CURL_FAILURES"

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingtheme:c8yTheme: Cumulocity related topics

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions