Improve the reliability of test_get_type_description_service.#1107
Merged
clalancette merged 1 commit intorollingfrom Oct 4, 2023
Merged
Conversation
We were seeing occasional failures of test_get_type_description_service on all platforms. It turns out that the removal of a service from the graph cache is an asynchronous operation. In particular, all of our current RMW implementations publish a graph update to the graph topic, and only remove things from the graph once that data has been delivered. This can potentially happen in another thread. That means that immediately after service_fini(), a call to get_service_names_and_types() may actually still have the old service name in it. Since our get_type_description tests were relying on this to go away, this was causing it to be flakey. We fix this here by adding in a new helper function, service_not_exists(). This helper is not just the inverse of service_exists() (which returns immediately when it finds the service in the graph cache). Instead, service_not_exists() waits until the service has *left* the cache before returning (or times out). In my testing, this fixes the flakiness locally. Signed-off-by: Chris Lalancette <clalancette@gmail.com>
Contributor
Author
|
This would close #1108 |
fujitatomoya
approved these changes
Oct 4, 2023
| } | ||
|
|
||
| std::this_thread::sleep_for(std::chrono::milliseconds(100)); | ||
| std::this_thread::sleep_for(std::chrono::milliseconds(10)); |
Collaborator
There was a problem hiding this comment.
(nice to have) adding argument std::chrono::milliseconds period would be more flexible.
Contributor
Author
There was a problem hiding this comment.
(nice to have) adding argument
std::chrono::milliseconds periodwould be more flexible.
Yeah, that's true. Given that this is only used in this test, it would be easy enough to add later if we need the flexibility. So I'm going to not do that for now, but thanks for the idea.
clalancette
added a commit
that referenced
this pull request
Jun 25, 2024
In particular, make sure to match all service_description_init with the appropriate finis. While we are in here, also add in a "service_not_exists" function that will quit much earlier, speeding up the test. This is essentially a backport of #1107 and parts of #1112, but adapted for Iron. Signed-off-by: Chris Lalancette <clalancette@gmail.com>
clalancette
added a commit
that referenced
this pull request
Jun 25, 2024
In particular, make sure to match all service_description_init with the appropriate finis. While we are in here, also add in a "service_not_exists" function that will quit much earlier, speeding up the test. This is essentially a backport of #1107 and parts of #1112, but adapted for Iron. Signed-off-by: Chris Lalancette <clalancette@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
We were seeing occasional failures of test_get_type_description_service on all platforms.
It turns out that the removal of a service from the graph cache is an asynchronous operation. In particular, all of our current RMW implementations publish a graph update to the graph topic, and only remove things from the graph once that data has been delivered. This can potentially happen in another thread.
That means that immediately after service_fini(), a call to get_service_names_and_types() may actually still have the old service name in it. Since our get_type_description tests were relying on this to go away, this was causing it to be flakey.
We fix this here by adding in a new helper function, service_not_exists(). This helper is not just the inverse of service_exists() (which returns immediately when it finds the service in the graph cache). Instead, service_not_exists() waits until the service has left the cache before returning (or times out).
In my testing, this fixes the flakiness locally.