Skip to content

[Windows] Address status/fix various flaky_on_windows tests#13837

Merged
ggreenway merged 5 commits intoenvoyproxy:masterfrom
greenhouse-org:windows-fix-flakes
Nov 3, 2020
Merged

[Windows] Address status/fix various flaky_on_windows tests#13837
ggreenway merged 5 commits intoenvoyproxy:masterfrom
greenhouse-org:windows-fix-flakes

Conversation

@wrowe
Copy link
Copy Markdown
Contributor

@wrowe wrowe commented Oct 30, 2020

Commit Message: [Windows] Address status/fix various flaky_on_windows tests
Additional Description:

  • Correct listener_impl_test, fixes
    TcpListenerImplTest.SetListenerRejectFractionIntermediate
    These fixes reflect that the timing on windows is often less
    synchronous than linux observations, due to the design of the
    underlying socket stack.
  • In QuicHttpIntegrationTest, the MultipleQuicConnectionsWithBPF is
    never relevant to Windows or macOS (and took a not-insignificant
    amount of time per test case.)
  • Enable no-longer-failing tests (no failure observed in RBE or locally)
    //test/extensions/grpc_credentials/file_based_metadata:integration_test
    //test/extensions/transport_sockets/alts:tsi_handshaker_test
    //test/integration:http2_flood_integration_test
    //test/integration:http2_upstream_integration_test
    //test/integration:overload_integration_test
    //test/server:guarddog_impl_test

Co-authored-by: Sunjay Bhatia sunjayb@vmware.com
Co-authored-by: William A Rowe Jr wrowe@vmware.com

Risk Level: Low (test enablement only)
Testing: Windows msvc-cl and clang-cl RBE and locally
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

These fixes reflect that the timing on windows is often less
synchronous than linux observations, due to the design of the
underlying socket stack.

- Correct listener_impl_test, fixes
    TcpListenerImplTest.SetListenerRejectFractionIntermediate

- In QuicHttpIntegrationTest, the MultipleQuicConnectionsWithBPF is
  never relevant to Windows or macOS (and took a not-insigificant
  amount of time per test case.)

- Enable no-longer-failing tests (no failure observed in RBE or locally)
    //test/extensions/grpc_credentials/file_based_metadata:integration_test
    //test/extensions/transport_sockets/alts:tsi_handshaker_test
    //test/integration:http2_flood_integration_test
    //test/integration:http2_upstream_integration_test
    //test/integration:overload_integration_test
    //test/server:guarddog_impl_test

Co-authored-by: Sunjay Bhatia <sunjayb@vmware.com>
Co-authored-by: William A Rowe Jr <wrowe@vmware.com>
Signed-off-by: Sunjay Bhatia <sunjayb@vmware.com>
Signed-off-by: William A Rowe Jr <wrowe@vmware.com>
@sunjayBhatia
Copy link
Copy Markdown
Member

@envoyproxy/windows-dev

}

// Only run on platforms that support BPF
#if defined(SO_ATTACH_REUSEPORT_CBPF) && defined(__linux__)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just if defined(SO_ATTACH_REUSEPORT_CBPF)?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we matched what was in here:

#if defined(SO_ATTACH_REUSEPORT_CBPF) && defined(__linux__)

looks like the test has undergone this change already looking back at the blame

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see this diff: f8c3acc#diff-5872f75ef2969ee19f048c4dfc487bb7acc649021940a8acca7a4921b2d5c910

not sure why exactly the ifdef was taken out for platforms that do not support BPF

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davinci26
Copy link
Copy Markdown
Member

LGTM

davinci26
davinci26 previously approved these changes Nov 2, 2020
testing::InSequence s1;
EXPECT_CALL(random_generator, random()).WillOnce(Return(std::numeric_limits<uint64_t>::max()));
EXPECT_CALL(listener_callbacks, onAccept_(_));
EXPECT_CALL(listener_callbacks, onAccept_(_)).WillOnce([&] { dispatcher_->exit(); });
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this change? It's not clear to me why this is better than the old version.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

making the dispatcher exit on the connected event makes the test racy, the dispatcher could exit before there is a chance to invoke the tcp listener onAccept callback which is effectively the last thing that happens when the connection is established/accepted

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. I thought the events happened in the opposite order. Sounds good.

}

// Only run on platforms that support BPF
#if defined(SO_ATTACH_REUSEPORT_CBPF) && defined(__linux__)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why ifdef this out? Doesn't it pass on non-bpf platforms? (I'm basing that assumption on the comment below that the runtime override is a no-op on non-bpf platforms).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does pass, but may give the wrong impress on non-BPF platforms for someone taking a quick glance that BPF support somehow works, ifdef-ing it out is mainly a clarification

Also, we are having issues with this test timing out, not running the various parameterized runs of this test limits test runtime a bit

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think the test really should be renamed, and always run. The name should probably be "Default mode" or something. It is testing that, given no extra config, quic works on this platform, which is why I think it should run on all platforms.

It is doing the same thing as the next test on non-bpf platforms, but the goal is to ensure that both configurations work on all platforms.

Signed-off-by: Sunjay Bhatia <sunjayb@vmware.com>
Signed-off-by: Sunjay Bhatia <sunjayb@vmware.com>

TEST_P(QuicHttpIntegrationTest, MultipleQuicConnectionsWithBPF) { testMultipleQuicConnections(); }
// Ensure multiple quic connections work, regardless of platform BPF support
TEST_P(QuicHttpIntegrationTest, MultipleQuicConnectionsDefaultMode) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ggreenway renamed, let me know what you think (removed BPF from the name)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that looks better, thanks

Signed-off-by: Sunjay Bhatia <sunjayb@vmware.com>
Copy link
Copy Markdown
Member

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, modulo one minor nit

testing::InSequence s1;
EXPECT_CALL(random_generator, random()).WillOnce(Return(std::numeric_limits<uint64_t>::max()));
// Exiting dispatcher on client side connect event can cause a race, listener accept callback
// may not have been triggered, exit dispatcher here to prevent this
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: period at end of sentence.

Signed-off-by: Sunjay Bhatia <sunjayb@vmware.com>
@ggreenway ggreenway self-assigned this Nov 2, 2020
@ggreenway ggreenway merged commit 5df0a6b into envoyproxy:master Nov 3, 2020
@sunjayBhatia sunjayBhatia deleted the windows-fix-flakes branch November 3, 2020 20:09
@wrowe wrowe removed this from the windows-ga milestone Jan 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants