ReaderProxy history reloaded incorrectly missing GAP message [4962]#453
ReaderProxy history reloaded incorrectly missing GAP message [4962]#453guillaumeautran wants to merge 1 commit intoeProsima:developfrom
Conversation
|
Looks like the build failure on Linux is due to a infrastructure problem Not really sure why the MAC build failed though. The test results is unclear to me... |
src/cpp/rtps/writer/ReaderProxy.cpp
Outdated
There was a problem hiding this comment.
These lines were removed on PR #425, as the error message could incorrectly be shown on correct situations after the changes on PR #411. There could be a situation were we receive an acknack message requesting a single change that is not considered unacknowledged (may be underway), in which case the logError will be incorrectly stated.
There was a problem hiding this comment.
Yes, I noticed this after the fact. Is there a easy way to differentiate those two cases? (when the missing change is legitimate vs. when it is not?).
There was a problem hiding this comment.
The only non-legitimate situation I can think of is if the acknack requests a change that has been already acknowledged, in which case its sequence number would be lower or equal to the low_mark. We could add some code before line 172. Something like:
if(*sit <= changesFromRLowMark_)
{
logError(RTPS_WRITER,"Requested already acknowledged change: " << *sit
<< " (low mark: " << changesFromRLowMark_ << ")");
}
else if(chit != m_changesForReader.end() && UNACKNOWLEDGED == chit->getStatus())
We could even just perform an assert(*sit > changesFromRLowMark_ ) before line 170.
There was a problem hiding this comment.
I'm not going to perform the assert because I know there are cases where the requested change is above the low_mark. But I will most definitely silence the log is the requested changes are below the low_mark.
[2019-04-01 15:23:48.632 UTC] [ERROR] [/bridge]: [fastrtps.RTPS_WRITER] Requested Changes: 14 15 16 17 18 30 31 32 33 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 not found (low mark: 13)
There was a problem hiding this comment.
The trace shown may just mean those changes are either already marked as requested (i.e. the same acknack was received twice), or considered underway (i.e. recently sent). Both are legal situations. That is the reason the message was originally silenced.
There was a problem hiding this comment.
Ok, so I've modified the code to record those sequence numbers greater than the low mark but not found in the reader history. These are most definitely an error that should not happen. Let me know if that's good now.
There was a problem hiding this comment.
but not found in the reader history
I guess you mean writer history, as the ReaderProxy is part of the writer.
Even though, this will still log errors in legal situations. The changes may have been removed from the writer's history (due to history limits, removed by the user, ...). In that case, after receiving the acknack, a gap message will be sent informing the reader of the missing changes.
There was a problem hiding this comment.
Right, the reader proxy (aka the writer history).
I am still sceptical wrt the GAP message being sent in the cases where there is a GAP in the writer history without a corresponding irrelevant change recorded in the reader proxy.
At least in 1.7.2 release, I do not see this happening. Can you please point me to the code where this is taken care of (ie: a ACKNACK message received generating a GAP message for missing change seqnum > low mark)? IMHO, if that was the case, this whole fix would be mute since the whole problem this fix addresses is the fact that a new reader proxy being created did not get initialized with the missing (irrelevant) change records indicating the gap at the tail end of the Writer History.
There was a problem hiding this comment.
I've taken a deeper look at this and I've got to the conclusion that you're right. The changes may not be on the writer's history but should be on the ReaderProxy collection. We may refactor that in the future, but right now we're good to go with your changes.
3954090 to
e7ce157
Compare
abdb99f to
f552a76
Compare
|
@MiguelCompany : Just rebased against latest develop. Feel free to merge anytime. |
The problem is invalid changes at the end of the stateful writer history don't get added to a newly connected reader proxy cache causing the reader proxy to miss a lot of invalid changed and not generate the proper GAP message. The fix is to make sure the missing changes are loaded into the reader proxy history so that the GAP message is properly generated. Reworked the error message on missing changes to eliminate legitimate cases (UNACK'ed changed or duplicated ACK request). The ERROR will only show if the change was not found and the change was not ACK'ed previously. Remove meaningless assert statement. issue eProsima#452
f552a76 to
a19016e
Compare
|
@MiguelCompany Any help in understanding why the MAC tests are not passing? |
|
ReaderProxy was refactored and changes are now removed from ReaderProxy when they are removed from writer's history. I've found that irrelevant changes are added to the collection of the ReaderProxy and they shouldn't. GAP messages are sent taking into consideration history's So to take already removed changes into account when a new reader matches, and based on your changes, I propose the following patch: |
|
Sounds good @MiguelCompany |
|
Closing in favor of #502 |
The problem is invalid changes at the end of the stateful writer history don't get added to a newly connected reader proxy cache causing the reader proxy to miss a lot of invalid changed and not generate the proper GAP message.
The fix is to make sure the missing changes are loaded into the reader proxy history so that the GAP message is properly generated.
should fix #452