
Tests fail in some environments #1445

@NightTsarina

Hi, I have just uploaded 0.15.0 to Debian and immediately got test failures on 4 architectures. I was ready to attribute this to an architecture-specific problem, but after looking at the code and at #1434, I think the problem is in how memberlist deals with IP addresses (which seems pretty hackish, btw, and does not seem to handle IPv6 properly).

These are the errors:

=== RUN   TestJoinLeave
--- FAIL: TestJoinLeave (0.01s)
	require.go:794: 
			Error Trace:	cluster_test.go:44
			Error:      	Received unexpected error:
			            	Failed to get final advertise address: Failed to parse advertise address "<nil>"
			            	create memberlist
			            	github.com/prometheus/alertmanager/cluster.Join
			            		/home/tincho/prometheus-alertmanager-0.15.0+ds/build/src/github.com/prometheus/alertmanager/cluster/cluster.go:207
			            	github.com/prometheus/alertmanager/cluster.TestJoinLeave
			            		/home/tincho/prometheus-alertmanager-0.15.0+ds/build/src/github.com/prometheus/alertmanager/cluster/cluster_test.go:29
			            	testing.tRunner
			            		/usr/lib/go-1.10/src/testing/testing.go:777
			            	runtime.goexit
			            		/usr/lib/go-1.10/src/runtime/asm_s390x.s:986
			Test:       	TestJoinLeave
=== RUN   TestReconnect
--- FAIL: TestReconnect (0.03s)
	require.go:794: 
			Error Trace:	cluster_test.go:97
			Error:      	Received unexpected error:
			            	Failed to get final advertise address: Failed to parse advertise address "<nil>"
			            	create memberlist
			            	github.com/prometheus/alertmanager/cluster.Join
			            		/home/tincho/prometheus-alertmanager-0.15.0+ds/build/src/github.com/prometheus/alertmanager/cluster/cluster.go:207
			            	github.com/prometheus/alertmanager/cluster.TestReconnect
			            		/home/tincho/prometheus-alertmanager-0.15.0+ds/build/src/github.com/prometheus/alertmanager/cluster/cluster_test.go:82
			            	testing.tRunner
			            		/usr/lib/go-1.10/src/testing/testing.go:777
			            	runtime.goexit
			            		/usr/lib/go-1.10/src/runtime/asm_s390x.s:986
			Test:       	TestReconnect
=== RUN   TestRemoveFailedPeers
--- FAIL: TestRemoveFailedPeers (0.01s)
	require.go:794: 
			Error Trace:	cluster_test.go:152
			Error:      	Received unexpected error:
			            	Failed to get final advertise address: Failed to parse advertise address "<nil>"
			            	create memberlist
			            	github.com/prometheus/alertmanager/cluster.Join
			            		/home/tincho/prometheus-alertmanager-0.15.0+ds/build/src/github.com/prometheus/alertmanager/cluster/cluster.go:207
			            	github.com/prometheus/alertmanager/cluster.TestRemoveFailedPeers
			            		/home/tincho/prometheus-alertmanager-0.15.0+ds/build/src/github.com/prometheus/alertmanager/cluster/cluster_test.go:137
			            	testing.tRunner
			            		/usr/lib/go-1.10/src/testing/testing.go:777
			            	runtime.goexit
			            		/usr/lib/go-1.10/src/runtime/asm_s390x.s:986
			Test:       	TestRemoveFailedPeers
=== RUN   TestInitiallyFailingPeers
--- FAIL: TestInitiallyFailingPeers (0.01s)
	require.go:794: 
			Error Trace:	cluster_test.go:198
			Error:      	Received unexpected error:
			            	Failed to get final advertise address: Failed to parse advertise address "<nil>"
			            	create memberlist
			            	github.com/prometheus/alertmanager/cluster.Join
			            		/home/tincho/prometheus-alertmanager-0.15.0+ds/build/src/github.com/prometheus/alertmanager/cluster/cluster.go:207
			            	github.com/prometheus/alertmanager/cluster.TestInitiallyFailingPeers
			            		/home/tincho/prometheus-alertmanager-0.15.0+ds/build/src/github.com/prometheus/alertmanager/cluster/cluster_test.go:183
			            	testing.tRunner
			            		/usr/lib/go-1.10/src/testing/testing.go:777
			            	runtime.goexit
			            		/usr/lib/go-1.10/src/runtime/asm_s390x.s:986
			Test:       	TestInitiallyFailingPeers
FAIL
FAIL	github.com/prometheus/alertmanager/cluster	0.064s

It seems the code at fault tries to select a "private" IP address, fails, and does not give a useful error message. It assumes far too much about the environment, particularly a test environment, where RFC 1918 addresses might not be present at all.

I replaced the INADDR_ANY addresses with loopback addresses, and now the tests pass:

--- cluster/cluster_test.go	2018-06-28 15:00:17.000000000 +0000
+++ build/src/github.com/prometheus/alertmanager/cluster/cluster_test.go	2018-06-29 11:24:08.258809245 +0000
@@ -29,7 +29,7 @@
 	p, err := Join(
 		logger,
 		prometheus.NewRegistry(),
-		"0.0.0.0:0",
+		"127.0.0.1:0",
 		"",
 		[]string{},
 		true,
@@ -53,7 +53,7 @@
 	p2, err := Join(
 		logger,
 		prometheus.NewRegistry(),
-		"0.0.0.0:0",
+		"127.0.0.2:0",
 		"",
 		[]string{p.Self().Address()},
 		true,
@@ -82,7 +82,7 @@
 	p, err := Join(
 		logger,
 		prometheus.NewRegistry(),
-		"0.0.0.0:0",
+		"127.0.0.1:0",
 		"",
 		[]string{},
 		true,
@@ -102,7 +102,7 @@
 	p2, err := Join(
 		logger,
 		prometheus.NewRegistry(),
-		"0.0.0.0:0",
+		"127.0.0.2:0",
 		"",
 		[]string{},
 		true,
@@ -137,7 +137,7 @@
 	p, err := Join(
 		logger,
 		prometheus.NewRegistry(),
-		"0.0.0.0:0",
+		"127.0.0.1:0",
 		"",
 		[]string{},
 		true,
@@ -183,7 +183,7 @@
 	p, err := Join(
 		logger,
 		prometheus.NewRegistry(),
-		"0.0.0.0:0",
+		"127.0.0.1:0",
 		"",
 		[]string{},
 		true,
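The patch works, as far as I can tell, because a concrete bind address leaves nothing to guess, whereas a wildcard address forces the library to pick an advertise address on its own. A small sketch of that distinction, using a hypothetical helper `needsAdvertiseDiscovery` (not part of alertmanager or memberlist) and only the standard library:

```go
package main

import (
	"fmt"
	"net"
)

// needsAdvertiseDiscovery (hypothetical helper) reports whether a host:port
// bind address uses a wildcard host such as 0.0.0.0 or ::, in which case
// the caller would have to discover a concrete advertise address separately.
func needsAdvertiseDiscovery(bindAddr string) (bool, error) {
	host, _, err := net.SplitHostPort(bindAddr)
	if err != nil {
		return false, err
	}
	if host == "" {
		// ":0" style addresses bind to all interfaces.
		return true, nil
	}
	ip := net.ParseIP(host)
	if ip == nil {
		return false, fmt.Errorf("bind host %q is not an IP literal", host)
	}
	return ip.IsUnspecified(), nil
}

func main() {
	for _, addr := range []string{"0.0.0.0:0", "127.0.0.1:0", "127.0.0.2:0", "[::]:0"} {
		wildcard, err := needsAdvertiseDiscovery(addr)
		if err != nil {
			fmt.Println(addr, "error:", err)
			continue
		}
		fmt.Printf("%-12s wildcard=%v\n", addr, wildcard)
	}
}
```

The tests could also keep `0.0.0.0:0` and pass an explicit advertise address instead; using loopback bind addresses is just the smaller change.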
