vagrant: update 4.19 and net-next VM images#18496

Merged: gandro merged 1 commit into master from pr/update-packer-ci-build on Jan 17, 2022

@nbusseneau
Member

We have been hitting issues in our CI since yesterday on all `master`
scheduled pipelines as well as `master`-based PR pipelines: the Cilium
agent seems unable to reach a healthy state due to IPv6-related issues
(see [1] and [2] below).

I have a hunch the issue lies in the Vagrant VMs themselves, as we no
longer seem to hit it on the 4.9- and 5.4-based pipelines since merging
3562520, an update to their Vagrant VMs that contained an unrelated fix.

This commit bumps the 4.19 and net-next VM images as well.

[1]

```
2022-01-13T05:42:17.179701398Z level=debug msg="Greeting failed" error="Get \"http://[fd04::12]:4240/hello\": dial tcp [fd04::12]:4240: connect: network is unreachable" host="http://[fd04::12]:4240" ipAddr="fd04::12" nodeName=k8s2 path="Via L3" subsys=health-server
2022-01-13T05:42:17.179730123Z level=debug msg="Failed to probe: Get \"http://[fd04::12]:4240/hello\": dial tcp [fd04::12]:4240: connect: network is unreachable" ipAddr="fd04::12" nodeName=k8s2 port=4240 subsys=health-server
```

[2]

```
2022-01-14T12:12:52.181387748Z level=fatal msg="Error while creating daemon" error="failed to finalise LB initialization: failed to initialize NodePort addrs. : Failed to determine IPv6 of enp0s3 for NodePort" subsys=daemon
```

Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>
@nbusseneau added the area/CI and release-note/ci labels Jan 17, 2022
@nbusseneau requested a review from a team as a code owner January 17, 2022 10:18
@nbusseneau
Member Author

This is a re-open of #18493 with the proper branch name. As noted in the original PR, CI passed:

I still have no idea why it works but both 4.19 and net-next tests are passing (as in: they are not failing with the agent startup issue above).

@tklauser
Member

tklauser left a comment

Thanks

@nbusseneau added and removed the ready-to-merge label Jan 17, 2022
@nbusseneau
Member Author

nbusseneau commented Jan 17, 2022

Forgot we need to wait for the runtime tests as well: https://jenkins.cilium.io/view/PR/job/Cilium-PR-Runtime-net-next/1026/

EDIT: but already we can tell by looking at the logs that it does not hit the agent startup issue either. Marking as ready to merge.
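
For reference, the log check described above could be scripted roughly as follows. This is only a sketch: the two signature substrings are copied from log excerpts [1] and [2] in the PR description, treating them as sufficient to identify the agent startup issue is an assumption, and `find_startup_issue` is a hypothetical helper name, not something in the Cilium tree.

```python
import re

# Substrings taken from the two log excerpts quoted in this PR. Treating them
# as the signature of the agent startup issue is an assumption of this sketch.
SIGNATURES = (
    "connect: network is unreachable",
    "Failed to determine IPv6 of",
)

# Matches the `level=<value>` field of the key=value log lines shown in the PR.
LEVEL_RE = re.compile(r"\blevel=(\w+)")


def find_startup_issue(lines):
    """Return (level, line) pairs for every line containing a known signature."""
    hits = []
    for line in lines:
        if any(sig in line for sig in SIGNATURES):
            match = LEVEL_RE.search(line)
            level = match.group(1) if match else "unknown"
            hits.append((level, line.rstrip("\n")))
    return hits
```

Passing an open log file (or any iterable of lines) returns an empty list when neither signature is present, which is the condition relied on here to say a run did not hit the issue.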

@nbusseneau added the ready-to-merge label Jan 17, 2022
@maintainer-s-little-helper

Job 'Cilium-PR-Runtime-net-next' failed and has not been observed before, so may be related to your PR:

Test Name

RuntimePrivilegedUnitTests Run Tests

Failure Output

FAIL: Failed to run privileged unit tests

If it is a flake, comment `/mlh new-flake Cilium-PR-Runtime-net-next` so I can create a new GitHub issue to track it.

@gandro merged commit 8bf4e22 into master Jan 17, 2022
@gandro deleted the pr/update-packer-ci-build branch January 17, 2022 10:48
@nbusseneau
Member Author

nbusseneau commented Jan 17, 2022

/mlh new-flake Cilium-PR-Runtime-net-next

👍 created #18497

gandro added a commit to gandro/cilium that referenced this pull request Jan 17, 2022
Since the net-next version bump in cilium#18496, this test is consistently
failing. Unfortunately, the net-next bump was necessary to fix a bigger
issue where all pipelines were broken, so let us disable that test for
now.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>