test: Run KVStore flakes in a loop#12419

Closed
pchaigno wants to merge 1 commit into master from pr/pchaigno/test-kvstore-flakes

@pchaigno
Member

@pchaigno pchaigno commented Jul 6, 2020

No description provided.

@pchaigno pchaigno added the release-note/misc label ("This PR makes changes that have no direct user impact.") Jul 6, 2020
@pchaigno pchaigno force-pushed the pr/pchaigno/test-kvstore-flakes branch 2 times, most recently from ecb3131 to c1dd2f6 Compare July 6, 2020 12:10
@pchaigno
Member Author

pchaigno commented Jul 6, 2020

test-focus RuntimeKVStore
Ran 176 times successfully: https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-Validated-Focus/337/consoleFull

@pchaigno
Member Author

pchaigno commented Jul 6, 2020

test-focus RuntimeKVStore
Ran 176 times successfully: https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-Validated-Focus/338/consoleFull

@pchaigno
Member Author

pchaigno commented Jul 6, 2020

test-focus RuntimeKVStore
Ran 176 times successfully: https://jenkins.cilium.io/job/Cilium-PR-Ginkgo-Tests-Validated-Focus/340/consoleFull

@pchaigno
Member Author

pchaigno commented Jul 6, 2020

retest-runtime
Other tests flaked, but RuntimeKVStore was all green.

@pchaigno
Member Author

pchaigno commented Jul 7, 2020

retest-runtime

@pchaigno pchaigno force-pushed the pr/pchaigno/test-kvstore-flakes branch from c1dd2f6 to fada2db Compare July 7, 2020 12:43
@coveralls

coveralls commented Jul 7, 2020

Coverage Status

Coverage decreased (-0.05%) to 36.895% when pulling 714db39 on pr/pchaigno/test-kvstore-flakes into 56e0e72 on master.

@pchaigno pchaigno force-pushed the pr/pchaigno/test-kvstore-flakes branch from fada2db to 4d0539f Compare July 7, 2020 15:31
@pchaigno
Member Author

pchaigno commented Jul 7, 2020

@pchaigno pchaigno force-pushed the pr/pchaigno/test-kvstore-flakes branch 2 times, most recently from 34d6dea to a791dc5 Compare July 7, 2020 18:13
@pchaigno
Member Author

pchaigno commented Jul 7, 2020

@pchaigno pchaigno force-pushed the pr/pchaigno/test-kvstore-flakes branch 2 times, most recently from 636fb9c to 5ca6e2e Compare July 7, 2020 20:06
@pchaigno
Member Author

pchaigno commented Jul 7, 2020

@pchaigno pchaigno force-pushed the pr/pchaigno/test-kvstore-flakes branch 2 times, most recently from d0bfc75 to 5639905 Compare July 8, 2020 06:42
@pchaigno
Member Author

pchaigno commented Jul 8, 2020

retest-runtime
All green :-(

@pchaigno
Member Author

pchaigno commented Jul 8, 2020

retest-runtime
Flaked multiple times: https://jenkins.cilium.io/job/Cilium-PR-Runtime-4.9/1120/

@pchaigno pchaigno force-pushed the pr/pchaigno/test-kvstore-flakes branch from 5639905 to a1c9815 Compare July 8, 2020 16:37
@pchaigno
Member Author

pchaigno commented Jul 8, 2020

retest-runtime
All green... ??

@pchaigno
Member Author

pchaigno commented Jul 8, 2020

@pchaigno pchaigno force-pushed the pr/pchaigno/test-kvstore-flakes branch from a1c9815 to 48e48f2 Compare July 9, 2020 05:44
@pchaigno
Member Author

pchaigno commented Jul 9, 2020

@pchaigno pchaigno force-pushed the pr/pchaigno/test-kvstore-flakes branch from 48e48f2 to 1dd0ad3 Compare July 9, 2020 09:43
@pchaigno
Member Author

pchaigno commented Jul 9, 2020

Signed-off-by: Paul Chaignon <paul@cilium.io>
@pchaigno pchaigno force-pushed the pr/pchaigno/test-kvstore-flakes branch from 1dd0ad3 to 714db39 Compare July 9, 2020 13:37
@pchaigno
Member Author

pchaigno commented Jul 9, 2020

retest-runtime
All green... https://jenkins.cilium.io/job/Cilium-PR-Runtime-4.9/1152/

@pchaigno
Member Author

pchaigno commented Jul 9, 2020

retest-runtime
All green :-)) https://jenkins.cilium.io/job/Cilium-PR-Runtime-4.9/1157/

pchaigno added a commit that referenced this pull request Jul 9, 2020
As for most tests, at the end of RuntimeKVStoreTest, we validate the logs
don't contain any worrisome messages with:

    vm.ValidateNoErrorsOnLogs(CurrentGinkgoTestDescription().Duration)

CurrentGinkgoTestDescription().Duration includes the execution of all
ginkgo.BeforeEach [1]. In RuntimeKVStoreTest, one of our
ginkgo.BeforeEach stops the Cilium systemd service (because we run
cilium-agent as a standalone binary in the test itself). Stopping Cilium
can result in worrisome messages in the logs e.g., if the compilation of
BPF programs is terminated abruptly. This in turn makes the tests fail
once in a while.

To fix this, we can replace CurrentGinkgoTestDescription().Duration with
our own "time counter" that doesn't include any of the
ginkgo.BeforeEach executions.

To validate this fix, I ran the whole RuntimeKVStoreTest with this
change 60 times locally and 60 times in the CI (#12419). The tests
passed all 120 times. Before applying the fix, the Consul test would
fail ~1/30 times, both locally and in CI.

1 - https://github.com/onsi/ginkgo/blob/9c254cb251dc962dc20ca91d0279c870095cfcf9/internal/spec/spec.go#L132-L134
Fixes: #11895
Fixes: 5185789 ("Test: Checks for deadlocks panics in logs per each test.")
Related: #12419
Signed-off-by: Paul Chaignon <paul@cilium.io>
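
The idea in the commit message above can be illustrated with a small, standalone sketch. It does not use Ginkgo or the Cilium test helpers: `logEntry`, `errorsSince`, `demo`, the timestamps, and the log messages are all hypothetical stand-ins, and the actual change in the Cilium test suite may look different. The sketch only shows why a log-validation window measured from the start of the spec (setup included) can pick up an error emitted during setup, while a window measured from a counter started after setup does not.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// logEntry is a hypothetical stand-in for one timestamped log line.
type logEntry struct {
	at  time.Time
	msg string
}

// errorsSince returns the error-level messages logged within the last
// `window` before `now`. It plays the role of a log validation helper
// that only inspects a recent time window.
func errorsSince(logs []logEntry, now time.Time, window time.Duration) []string {
	cutoff := now.Add(-window)
	var found []string
	for _, e := range logs {
		if !e.at.Before(cutoff) && strings.Contains(e.msg, "level=error") {
			found = append(found, e.msg)
		}
	}
	return found
}

// demo replays a fixed timeline and counts the errors seen by each window.
func demo() (wholeSpec, testOnly int) {
	// Start of the spec, i.e. before any setup step runs.
	specStart := time.Date(2020, 7, 9, 12, 0, 0, 0, time.UTC)

	// Setup phase: stopping the service logs an error (made-up message).
	logs := []logEntry{
		{specStart.Add(1 * time.Second), `level=error msg="BPF compilation interrupted"`},
	}

	// Our own "time counter" starts only once setup has finished.
	testStart := specStart.Add(5 * time.Second)

	// Test body: nothing worrisome is logged (made-up message).
	logs = append(logs, logEntry{testStart.Add(1 * time.Second), `level=info msg="agent started"`})

	now := specStart.Add(10 * time.Second)
	wholeSpec = len(errorsSince(logs, now, now.Sub(specStart)))
	testOnly = len(errorsSince(logs, now, now.Sub(testStart)))
	return wholeSpec, testOnly
}

func main() {
	wholeSpec, testOnly := demo()
	fmt.Printf("window including setup: %d error line(s)\n", wholeSpec)
	fmt.Printf("window from test start: %d error line(s)\n", testOnly)
}
```

With the fixed timeline used here, the window that includes setup reports one error line and the window that starts with the test body reports none, which mirrors the intermittent failure and its fix as described in the commit message.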
@pchaigno pchaigno closed this Jul 9, 2020
@pchaigno pchaigno deleted the pr/pchaigno/test-kvstore-flakes branch July 9, 2020 18:28
nebril pushed a commit that referenced this pull request Jul 10, 2020
pchaigno added a commit that referenced this pull request Jul 21, 2020
[ upstream commit e558100 ]
rolinh pushed a commit that referenced this pull request Jul 21, 2020
[ upstream commit e558100 ]