Skip to content

integration: Configure drain time in integration tests#11345

Merged
mattklein123 merged 5 commits intoenvoyproxy:masterfrom
aunu53:integration
Jun 2, 2020
Merged

integration: Configure drain time in integration tests#11345
mattklein123 merged 5 commits intoenvoyproxy:masterfrom
aunu53:integration

Conversation

@aunu53
Copy link
Copy Markdown
Contributor

@aunu53 aunu53 commented May 28, 2020

integration: Configure drain time in integration tests

Risk Level: Test-only change
Testing: Test-only change

Fixes #11240

Auni Ahsan added 3 commits May 28, 2020 13:37
Signed-off-by: Auni Ahsan <auni@google.com>
Signed-off-by: Auni Ahsan <auni@google.com>
Signed-off-by: Auni Ahsan <auni@google.com>
@aunu53 aunu53 changed the title Allow configurable drain time in integration tests integration: Configure drain time in integration tests May 28, 2020
@aunu53
Copy link
Copy Markdown
Contributor Author

aunu53 commented May 29, 2020

/retest

@repokitteh-read-only
Copy link
Copy Markdown

🤷‍♀️ nothing to rebuild.

🐱

Caused by: a #11345 (comment) was created by @auni53.

see: more, trace.

@aunu53
Copy link
Copy Markdown
Contributor Author

aunu53 commented May 29, 2020

@mattklein123 this has the retry test strategy we discussed earlier this week. I don't know what's going on with the presubmits.

@mattklein123 mattklein123 self-assigned this May 29, 2020
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks LGTM modulo question about length. Sorry for the confusion about what Alyssa was working on, my bad.

/wait


// Add a health check filter and verify correct behavior when draining.
TEST_P(ProtocolIntegrationTest, DrainClose) {
drain_time_ = std::chrono::seconds(10);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this really have to be 10s? I think @alyssawilk is not going to be happy with this. :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Command: > bazel test test/integration:protocol_integration_test --test_filter="*DrainClose/IPv6_Http2Downstream_HttpUpstream" --test_sharding_strategy=disabled --runs_per_test=10

  • 3s drain time: Stats over 10 runs: max = 7.5s, min = 7.3s, avg = 7.4s, dev = 0.1s
  • 10s drain time: Stats over 10 runs: max = 7.6s, min = 7.3s, avg = 7.4s, dev = 0.1s

I'm fine with whatever, but doesn't seem to be a big difference. (In the long timeout there'll probably just be more retries but those happen quickly.)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, while I'm not a fan of long sleeping tests, I don't think this affects test timing.
Offhand I think it would result in a 10s pause if we didn't get a request and send a response (if the drain close kicked in when waiting for a new pipelined request) but when we get a new request we respond with connection close and then close the connection after the delay close (not drain time) timeout.

In cases like this it might be good to actually set it super high so the test would time out if we ever tickled the limit, and comment why it's not going to actually slow the test down.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand your explanation. My understanding was that the probability for drain close might be low, but the test will rapidly send requests until a drain close is hit, so the test duration won't generally be increased. I verified this by adding a retry counter and setting the timeout to 100s, and found that the number of requests sent ranged from 15 to 210. This ultimately happens so quickly that the test time isn't really increased.

I bumped the timeout to 100s and added a comment to explain.

Signed-off-by: Auni Ahsan <auni@google.com>
@aunu53
Copy link
Copy Markdown
Contributor Author

aunu53 commented Jun 1, 2020

/retest

@repokitteh-read-only
Copy link
Copy Markdown

🤷‍♀️ nothing to rebuild.

🐱

Caused by: a #11345 (comment) was created by @auni53.

see: more, trace.

Signed-off-by: Auni Ahsan <auni@google.com>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@mattklein123 mattklein123 merged commit 190cce0 into envoyproxy:master Jun 2, 2020
@aunu53 aunu53 deleted the integration branch June 2, 2020 13:06
jianwen612 pushed a commit to jianwen612/envoy that referenced this pull request Jun 10, 2020
)

Signed-off-by: Auni Ahsan <auni@google.com>
Signed-off-by: jianwen <jianwendong@google.com>
yashwant121 pushed a commit to yashwant121/envoy that referenced this pull request Jul 24, 2020
)

Signed-off-by: Auni Ahsan <auni@google.com>
Signed-off-by: yashwant121 <yadavyashwant36@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Race condition between drain close and integration server shutdown

3 participants