Skip to content

xds: share rds configuration among envoy 2/2#9209

Merged
htuch merged 10 commits intoenvoyproxy:masterfrom
lambdai:newsharedrds
Dec 13, 2019
Merged

xds: share rds configuration among envoy 2/2#9209
htuch merged 10 commits intoenvoyproxy:masterfrom
lambdai:newsharedrds

Conversation

@lambdai
Copy link
Copy Markdown
Contributor

@lambdai lambdai commented Dec 3, 2019

Full context is here
The above PR is split into #9008 and this one.

In this PR envoy will maintain only one copy of rds config for each resource name.
Also fixed a the bug of early listener notified ready describe in #8781
The test case is included in this PR.

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I think it would make sense to have @stevenzzzz also take a pass, thanks.
/wait

std::shared_ptr<RdsRouteConfigProviderImpl> new_provider{
new RdsRouteConfigProviderImpl(std::move(subscription), factory_context)};
dynamic_route_config_providers_.insert(
{manager_identifier, std::weak_ptr<RdsRouteConfigProviderImpl>(new_provider)});
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to hold a strong reference to ensure this doesn't destruct on one of the worker threads?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subscription holds a strong reference of Provider here:

subscription_->routeConfigProviders().insert(this);

And map look up will only see two possibility:

  1. no such key in map, no matter worker threads hold a strong reference or not.
  2. map has such key. It means the subscription is not destructed. That subscription hold a strong ref to the provider.

Both possibilities are under the assumption that
subscription exist iff key exists in this dynamic_route_config_providers_ map
I think this assumption is correct because both the CRUD on map and construct/destruct on subscription occurs on main thread.
See Xin's comments 4 lines below.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My above comment could be wrong. I will add some ASSERT

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might worth check if the proviider is destructed in main thread instead, with my most recent change to thread_local_impl, a SlotImpl owner is free to be destructed on any thread.

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
@lambdai
Copy link
Copy Markdown
Contributor Author

lambdai commented Dec 5, 2019

I don't have a complete sense of the life time of RdsRouteConfigProviderImpl
But I think this PR is not breaking anything because

  1. This PR unifies the ownership of RdsRouteConfigProvider. Previously a HCM could own a RdsRouteConfigProvider, or own srds config provider which owns shared RdsRouteConfigProvider. In any scenario RdsRouteConfigProvider outlives than the corresponding RdsRouteConfigProvider before this PR
  2. Is the extended life time introduce new issue?
    I think the answer is NO. There is no obvious cycle strong reference.
    Directly respond to your concerning scenario, let's reason about it in this way: if the shared RdsRouteConfigProvider in this PR could be referenced, I can figure a sequence that RdsRouteConfigProvider, before this PR, is not destructed and is referenced.

Does above make sense?

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Copy link
Copy Markdown
Contributor

@stevenzzzz stevenzzzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
As we've discussed, migrating RDS to the new config-provide-framework would revert most of your work here tho.

EXPECT_TRUE(provider2->configInfo().has_value());

EXPECT_EQ(provider_.get(), provider2.get())
<< "fail to obtain the same rds config provider object";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this for debugging?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .get() part or the message?
The former: Yes, that's verbose. Will fix.
The latter: goal is to provide more details to debugging a test failure in the future.

std::shared_ptr<RdsRouteConfigProviderImpl> new_provider{
new RdsRouteConfigProviderImpl(std::move(subscription), factory_context)};
dynamic_route_config_providers_.insert(
{manager_identifier, std::weak_ptr<RdsRouteConfigProviderImpl>(new_provider)});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might worth check if the proviider is destructed in main thread instead, with my most recent change to thread_local_impl, a SlotImpl owner is free to be destructed on any thread.

Copy link
Copy Markdown
Contributor Author

@lambdai lambdai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I will address the small fixes.
And leave the provider frame work in the following PRs

EXPECT_TRUE(provider2->configInfo().has_value());

EXPECT_EQ(provider_.get(), provider2.get())
<< "fail to obtain the same rds config provider object";
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .get() part or the message?
The former: Yes, that's verbose. Will fix.
The latter: goal is to provide more details to debugging a test failure in the future.

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
@lambdai
Copy link
Copy Markdown
Contributor Author

lambdai commented Dec 6, 2019

@htuch @stevenzzzz
I think I find a hanging bug related to VHDS #9254
I am not going to fix it in this PR.

@htuch htuch added the waiting label Dec 10, 2019
@lambdai
Copy link
Copy Markdown
Contributor Author

lambdai commented Dec 13, 2019

@stevenzzzz and i don't have further action on this PR.
@htuch Is there any concern from your side?

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@htuch htuch merged commit f2e7c97 into envoyproxy:master Dec 13, 2019
prakhag1 pushed a commit to prakhag1/envoy that referenced this pull request Jan 3, 2020
Full context is here
The above PR is split into envoyproxy#9008 and this one.

In this PR envoy will maintain only one copy of rds config for each resource name.
Also fixed a the bug of early listener notified ready describe in envoyproxy#8781
The test case is included in this PR.

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Signed-off-by: Prakhar <prakhar_au@yahoo.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants