[23088] Solve Discovery Server race conditions (backport #5780) #5807

Merged
MiguelCompany merged 5 commits into 2.14.x from mergify/bp/2.14.x/pr-5780
Jul 29, 2025
mergify bot commented May 12, 2025

Description

This PR attempts to fix several race conditions detected in scenarios where a large number of clients are present:

  • PDP and EDP messages are stored in two queues that are not swapped atomically. As a result, a data r/w can be processed in one server routine iteration while the data P it depends on, although already received, is not processed until the next iteration.
  • When a data UP is received (or manually inserted when a participant is dropped), the participant's entry remains in the internal participants map with its change updated; the entry is deleted only after the data UP has been acked by all participants. If a new data P is received before that happens, it is not processed correctly. Consequently, once the data UP is acked the entry is deleted from the participants map, and any data r/w processed afterwards fails with one of these errors: "Reader/Writer has no associated participant. Skipping" or "Matching unexisting participant from reader/writer".
  • When a reader/writer is being inserted into the database, the check that the corresponding participant is present is done after the insertion instead of as a first step. Moreover, even if the participant is present, it should also be checked that it is alive, and the insertion aborted otherwise.
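
To illustrate the first item, here is a minimal sketch of taking both queues in a single critical section, so that a data r/w and the data P it depends on always land in the same iteration. All names (`Change`, `DiscoveryQueues`, `take_snapshot`) are hypothetical and chosen for illustration only; this is not the Fast DDS implementation nor the actual change made in this PR.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a discovery change; not the Fast DDS type.
struct Change
{
    std::string kind;    // "P" for participant data, "RW" for reader/writer data
    int participant_id;  // participant the change belongs to
};

// Hypothetical pair of incoming queues guarded by one mutex.
class DiscoveryQueues
{
public:

    // Called from listener threads when a PDP message arrives.
    void push_pdp(Change c)
    {
        std::lock_guard<std::mutex> guard(mutex_);
        pdp_incoming_.push_back(std::move(c));
    }

    // Called from listener threads when an EDP message arrives.
    void push_edp(Change c)
    {
        std::lock_guard<std::mutex> guard(mutex_);
        edp_incoming_.push_back(std::move(c));
    }

    // Called once per server routine iteration. Both queues are swapped out
    // under the same lock, so no message can slip in between the two swaps.
    std::pair<std::vector<Change>, std::vector<Change>> take_snapshot()
    {
        std::vector<Change> pdp;
        std::vector<Change> edp;
        {
            std::lock_guard<std::mutex> guard(mutex_);
            pdp.swap(pdp_incoming_);
            edp.swap(edp_incoming_);
        }
        return std::make_pair(std::move(pdp), std::move(edp));
    }

private:

    std::mutex mutex_;
    std::vector<Change> pdp_incoming_;
    std::vector<Change> edp_incoming_;
};
```

With this shape, the server routine would process `snapshot.first` (PDP) before `snapshot.second` (EDP); any EDP change present in a snapshot is guaranteed to see every PDP change that was queued before it.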

@Mergifyio backport 3.1.x 2.14.x 2.10.x

Contributor Checklist

  • Commit messages follow the project guidelines.

  • The code follows the style guidelines of this project.

  • Tests that thoroughly check the new feature have been added/Regression tests checking the bug and its fix have been added; the added tests pass locally

  • N/A Any new/modified methods have been properly documented using Doxygen.

  • N/A Any new configuration API has an equivalent XML API (with the corresponding XSD extension)

  • Changes are backport compatible: they do NOT break ABI nor change library core behavior.

  • Changes are API compatible.

  • N/A New feature has been added to the versions.md file (if applicable).

  • N/A New feature has been documented/Current behavior is correctly described in the documentation.

  • Applicable backports have been included in the description.

Reviewer Checklist

  • The PR has a milestone assigned.
  • The title and description correctly express the PR's purpose.
  • Check contributor checklist is correct.
  • N/A If this is a critical bug fix, backports to the critical-only supported branches have been requested.
  • Check CI results: changes do not issue any warning.
  • Check CI results: failing tests are unrelated to the changes.

This is an automatic backport of pull request #5780 done by [Mergify](https://mergify.com).

mergify bot commented May 12, 2025

Cherry-pick of ec666f7 has failed:

On branch mergify/bp/2.14.x/pr-5780
Your branch is up to date with 'origin/2.14.x'.

You are currently cherry-picking commit ec666f72.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   src/cpp/rtps/builtin/discovery/database/DiscoveryDataBase.hpp
	modified:   src/cpp/rtps/builtin/discovery/participant/PDPServer.cpp

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   src/cpp/rtps/builtin/discovery/database/DiscoveryDataBase.cpp
	both modified:   test/blackbox/common/BlackboxTestsDiscovery.cpp

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

@mergify mergify bot added the conflicts Backport PR wich git cherry pick failed label May 12, 2025
@Mario-DL Mario-DL added this to the v2.14.5 milestone May 14, 2025
cferreiragonz previously approved these changes Jul 24, 2025
cferreiragonz commented

I backported the core_types from master in order to be able to filter out acknack messages. Uncrustify is failing due to the generated code.

@cferreiragonz cferreiragonz added ci-pending PR which CI is running and removed conflicts Backport PR wich git cherry pick failed needs rebase labels Jul 24, 2025
@cferreiragonz cferreiragonz requested review from richiprosima and removed request for richiprosima July 24, 2025 13:53
@cferreiragonz cferreiragonz force-pushed the mergify/bp/2.14.x/pr-5780 branch from 3112f77 to 60f4f42 Compare July 28, 2025 08:02
juanlofer-eprosima and others added 5 commits July 29, 2025 07:32
* Refs #23088: Test reconnection when removing participant

Signed-off-by: cferreiragonz <carlosferreira@eprosima.com>

* Refs #23088: Solve EDP-PDP queues race condition

Signed-off-by: Juan Lopez Fernandez <juanlopez@eprosima.com>

* Refs #23088: Solve data UP + data P race condition

Signed-off-by: Juan Lopez Fernandez <juanlopez@eprosima.com>

* Refs #23088: Abort writer/reader processing if associated participant not alive

Signed-off-by: Juan Lopez Fernandez <juanlopez@eprosima.com>

* Refs #23088: Apply suggestions

Signed-off-by: Juan Lopez Fernandez <juanlopez@eprosima.com>

* Refs #23088: Release change when writer/reader insertion in DB failed

Signed-off-by: Juan Lopez Fernandez <juanlopez@eprosima.com>

* Refs #23088: Match servers after change update

Signed-off-by: Juan Lopez Fernandez <juanlopez@eprosima.com>

---------

Signed-off-by: cferreiragonz <carlosferreira@eprosima.com>
Signed-off-by: Juan Lopez Fernandez <juanlopez@eprosima.com>
Co-authored-by: cferreiragonz <carlosferreira@eprosima.com>
(cherry picked from commit ec666f7)

# Conflicts:
#	src/cpp/rtps/builtin/discovery/database/DiscoveryDataBase.cpp
#	test/blackbox/common/BlackboxTestsDiscovery.cpp
Signed-off-by: cferreiragonz <carlosferreira@eprosima.com>
Signed-off-by: cferreiragonz <carlosferreira@eprosima.com>
Signed-off-by: cferreiragonz <carlosferreira@eprosima.com>
Signed-off-by: cferreiragonz <carlosferreira@eprosima.com>
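
The commits "Abort writer/reader processing if associated participant not alive" and "Release change when writer/reader insertion in DB failed" can be pictured roughly as follows. This is only a sketch under stated assumptions: `ParticipantEntry`, `EndpointDatabase`, `ChangePool` and `process_endpoint_change` are hypothetical names invented for illustration, not classes or functions from Fast DDS.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical, simplified participant record; not a Fast DDS class.
struct ParticipantEntry
{
    bool alive;  // set to false once a dispose (data UP) has been recorded
};

// Hypothetical database that validates the participant before inserting.
class EndpointDatabase
{
public:

    void add_participant(int id)
    {
        participants_[id] = ParticipantEntry{true};
    }

    void mark_disposed(int id)
    {
        auto it = participants_.find(id);
        if (it != participants_.end())
        {
            it->second.alive = false;
        }
    }

    // Returns true only if the endpoint was stored. The participant lookup
    // and the liveliness check happen first, before endpoints_ is modified.
    bool insert_endpoint(int participant_id, const std::string& endpoint_name)
    {
        auto it = participants_.find(participant_id);
        if (it == participants_.end() || !it->second.alive)
        {
            return false;  // abort: unknown or disposed participant
        }
        endpoints_[endpoint_name] = participant_id;
        return true;
    }

    std::size_t endpoint_count() const
    {
        return endpoints_.size();
    }

private:

    std::map<int, ParticipantEntry> participants_;
    std::map<std::string, int> endpoints_;
};

// Hypothetical pool that only counts changes currently reserved.
struct ChangePool
{
    int outstanding = 0;
    void reserve() { ++outstanding; }
    void release() { --outstanding; }
};

// Reserves a change, tries the insertion, and releases the change on failure
// so that a rejected reader/writer does not leak it.
bool process_endpoint_change(
        EndpointDatabase& db,
        ChangePool& pool,
        int participant_id,
        const std::string& endpoint_name)
{
    pool.reserve();
    if (!db.insert_endpoint(participant_id, endpoint_name))
    {
        pool.release();
        return false;
    }
    return true;  // the database keeps the change
}
```

The design point is ordering: rejecting the endpoint before any state is touched means there is nothing to roll back in the database, and the caller only has to return the change it reserved.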
@cferreiragonz cferreiragonz force-pushed the mergify/bp/2.14.x/pr-5780 branch from 60f4f42 to eaa6b1d Compare July 29, 2025 05:32
@cferreiragonz cferreiragonz self-requested a review July 29, 2025 05:32
@cferreiragonz cferreiragonz added ready-to-merge Ready to be merged. CI and changes have been reviewed and approved. and removed ci-pending PR which CI is running labels Jul 29, 2025
@MiguelCompany MiguelCompany merged commit 1e881c6 into 2.14.x Jul 29, 2025
16 of 19 checks passed
@MiguelCompany MiguelCompany deleted the mergify/bp/2.14.x/pr-5780 branch July 29, 2025 12:13