Skip to content

Fix hot restart bug due to passed an fd without bind#8426

Merged
mattklein123 merged 5 commits intoenvoyproxy:masterfrom
zyfjeff:fix-hotrestart-bug
Oct 4, 2019
Merged

Fix hot restart bug due to passed an fd without bind#8426
mattklein123 merged 5 commits intoenvoyproxy:masterfrom
zyfjeff:fix-hotrestart-bug

Conversation

@zyfjeff
Copy link
Copy Markdown
Member

@zyfjeff zyfjeff commented Sep 29, 2019

When the 15001 listener appears at the same time, one of them will bind the socket and listening, and the other will not bind. In this case, the hot restart gets the socket of the previous process and may get the socket that is not listening.

Description: Fix hot restart bug due to passed an fd without bind
Risk Level: low
Testing: Existing tests and new tests
Docs Changes: N/A
Release Notes: N/A
[Optional Fixes #Issue] #8427

Signed-off-by: tianqian.zyf <tianqian.zyf@alibaba-inc.com>
@zyfjeff zyfjeff changed the title WIP: Fix hot restart bug due to passed an fd without bind Fix hot restart bug due to passed an fd without bind Sep 29, 2019
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good a at a high level, thanks. Some small comments.

/wait

: Network::Utility::UDP_SCHEME;
const std::string addr = absl::StrCat(scheme, address->asString());

if (!bind_to_port && socket_type == Network::Address::SocketType::Stream) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? Won't this just return -1 below once you check for not binding in the parent?

Copy link
Copy Markdown
Member Author

@zyfjeff zyfjeff Oct 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the need to determine whether the bind, if there is no bind socket to the parent process to obtain the socket, may get the already bind socket, in addition, there is no way to handle the traffic without the bind socket, we can not go to the parent process to obtain the socket, directly create it.

Network::Address::InstanceConstSharedPtr addr =
Network::Utility::resolveUrl(request.pass_listen_socket().address());
for (const auto& listener : server_->listenerManager().listeners()) {
if (listener.get().socket().socketType() == Network::Address::SocketType::Stream &&
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pedantically, do we need the stream type check? Isn't just checking for bind to port sufficient?

Copy link
Copy Markdown
Member Author

@zyfjeff zyfjeff Oct 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right, I understand it wrong, udp socket does not have bind to port, there is no way to handle traffic., so we just need to judge whether bind to port is enough.

Signed-off-by: tianqian.zyf <tianqian.zyf@alibaba-inc.com>
@zyfjeff zyfjeff force-pushed the fix-hotrestart-bug branch from cae0fd1 to 6347e39 Compare October 1, 2019 02:42
Signed-off-by: tianqian.zyf <tianqian.zyf@alibaba-inc.com>
@zyfjeff
Copy link
Copy Markdown
Member Author

zyfjeff commented Oct 1, 2019

/retest

@repokitteh-read-only
Copy link
Copy Markdown

🔨 rebuilding ci/circleci: coverage (failed build)

🐱

Caused by: a #8426 (comment) was created by @zyfjeff.

see: more, trace.

@mattklein123
Copy link
Copy Markdown
Member

@zyfjeff friendly request to please not force push. It makes reviewing more difficult. Thank you!

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this makes sense to me. One test request .Thank you!

/wait

} else {
return std::make_shared<Network::UdpListenSocket>(std::move(io_handle), address, options);

if (bind_to_port) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a listener manager test for this case also?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the idea of ​​my test case is to test when the bind_to_port is equal to false, not to get the socket from the hot restart, but to create the socket directly.

Signed-off-by: tianqian.zyf <tianqian.zyf@alibaba-inc.com>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM with some small comments.

/wait

Network::Address::SocketType socket_type,
const Network::Socket::OptionsSharedPtr& options,
bool bind_to_port) -> Network::SocketSharedPtr {
// When bind_to_port is equal to false, create socket fd directly, and do not get socket
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an explicit expectation with Times(0) to the mock hot restarter to make sure we never call it?

class ListenerManagerImplTest : public testing::Test {
protected:
ListenerManagerImplTest() : api_(Api::createApiForTest()) {
ListenerManagerImplTest() : api_(Api::createApiForTest()), real_listener_factory_(server_) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just define this in the test where you use it for now as we don't use it anywhere else?

Signed-off-by: tianqian.zyf <tianqian.zyf@alibaba-inc.com>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks.

@mattklein123
Copy link
Copy Markdown
Member

/azp run envoy-linux

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@mattklein123 mattklein123 merged commit fcb3cf1 into envoyproxy:master Oct 4, 2019
danzh2010 pushed a commit to danzh2010/envoy that referenced this pull request Oct 4, 2019
Signed-off-by: tianqian.zyf <tianqian.zyf@alibaba-inc.com>
@zyfjeff zyfjeff deleted the fix-hotrestart-bug branch October 12, 2019 02:32
nandu-vinodan pushed a commit to nandu-vinodan/envoy that referenced this pull request Oct 17, 2019
Signed-off-by: tianqian.zyf <tianqian.zyf@alibaba-inc.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants