Skip to content

HTLC restoration#1255

Merged
Roasbeef merged 5 commits intolightningnetwork:masterfrom
halseth:htlc-mismatch
May 18, 2018
Merged

HTLC restoration#1255
Roasbeef merged 5 commits intolightningnetwork:masterfrom
halseth:htlc-mismatch

Conversation

@halseth
Copy link
Contributor

@halseth halseth commented May 16, 2018

This PR attempts to fix a lingering bug within channel.go where the log counters would get out of sync with the HTLCs in the index after a restore.

Additionally, we remove some redundant restoration code to simplify the logic.

Fixes #1247
(potentially also fixes #1124).

halseth added 5 commits May 16, 2018 20:53
remoteUpdateLog from localCommit

This commit fixes a bug within channel.go that would lead to the
content of the update logs and their indexes getting out of sync during
restores.

The scenario that could occur was that the localUpdateLog was initiated
with a log index taken from the localCommitment. Updates we send (which
are added to the localUpdateLog) will be added to the remote commitment
first. The problem happened when an update was sent and added to the
remote commitment, but not ACKed. Since it was not ACKed, we would not
add it to our local commitment. During a restart/restore we would init
the localUpdateLog with a height too low, such that when going through
the outgoing HTLCs on the remote commitment, we would restore an HTLC at
an index higher than our local log HTLC counter.

The symmetric change is done to the remoteUpdateLog.
This commit adds a test ensuring that the fix applied in the previous
commit works as expected. The test exercises the scenario where the
HTLCs on the local, remote and pending remote commitment differ, and we
attempt to restore the update logs. We now check that in this case the
logs before and after restart are equivalent.
This commit adds a test that runs through a scenario where an HTLC is
added then failed, making sure the update logs are properly restored at
any point during the process.
This commit removes the stage during updateLog restoration where we
would attempt to restore incoming HLTCs from the pendingRemoteCommit, in
addition to update our log and htlc counter to reflect this state. The
reason we can safely remove this is to observe that a pending remote
commit is always created from a commitDiff which only contains updates
made by _us_, and thus only taken from the localUpdateLog. The same can
be said for the counters, when creating a commitDiff we'll always use
the remoteACKedIndex as the index into the remoteUpdateLog, meaning that
all potential updates will already be included in the remote commit that
has been ACKed.
This commit removes redundant HTLC restoring. We don't have to restore
outgoing HTLCs from the local commitment, as we _know_ they will always
be added to the remote commitment first. Also, when receiving
Settles/Fails, they will be removed from the local commitment first.
This way we can be sure that outgoing HTLCs found on the local
commitment always will be found on the remote commitment

Similarly we don't have to restore incoming HTLCs from the remote
commitment, as they will be added to the local commitment first.
@Roasbeef Roasbeef added this to the 0.4.2-beta milestone May 16, 2018
Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work! I think this may resolve of number of issues we've been seeing on startup within the state machine, which may have spilled over into the switch logic as well, due to us attempting to re-forward duplicate HTLCs.

The diff looks good to me, but I'm going to run it on a few of my nodes for a bit to see if anything comes up. AFAICT, the diff is very simpler and concise though.

// from the local and remote commitments.
localUpdateLog := newUpdateLog(
localCommit.LocalLogIndex, localCommit.LocalHtlcIndex,
remoteCommit.LocalLogIndex, remoteCommit.LocalHtlcIndex,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I see what you're getting at, but shouldn't this be the remote pending commitent? Our pending changes will be committed to that commit diff, not this "locked-in" commitment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the remote commit should be fine, as all updates on the pending commit will be appended, not restored, and therefore increment the counters.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep

// contain no HTLCs, her remote commitment will contain an HTLC with
// index 0, and the pending remote commitment (a signed remote
// commitment which is not AKCed yet) will contain an additional HTLC
// with index 1.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh ok, it's clear now: the case where Alice sends another sig before Bob sends a sig to sync up her local commit. In this case, Alice's latest log counter is actually what's on the remote commit.

Need to think a bit w.r.t if there are cases where this is actually incorrect...🤔

// update logc based on if it's incoming vs outgoing. For any incoming
// HTLC's, we also re-add it to the rHashMap so we can quickly look it
// up.
// For each incoming HTLC within the local commitment, we add it to the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, nice simplification 👍

This may also have lend to redundant HTLCs being added to the log, which may cause issues when we go to decide which HTLCs are locked in and ready for forwarding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code I removed would be just noops in all cases. I tried coming up with situations where they would actually add something, but was convinced that they will never by the explanation in the commit message. I might be wrong, but the commit is made by the assumption that it doesn't change behaviour.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought about this a bit over the past few days, and my current conclusion is that you are indeed correct. Also that these should have technically been no-ops, but could have times caused redundant elements in the linked list.

return nil
}

// If we do have a dangling commitment for the remote party, then we'll
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The
reason we can safely remove this is to observe that a pending remote
commit is always created from a commitDiff which only contains updates
made by us, and thus only taken from the localUpdateLog

I'm not sure that this is always the case, as it's possible to have a situation wherein we commit in coming HTLCs for them for the first time (as a side effect of our commitment).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although, if we're committing these HTLCs to them for the first time, then it should already be in the remote log, as on start up we'll populate the remote log with incoming HTLCs on our commitment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, remote updates will be applied to our commit first, which will be used to populate the index. This commit should also not change behaviour.

Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🎉

@Roasbeef Roasbeef merged commit 6a2a049 into lightningnetwork:master May 18, 2018
@halseth halseth deleted the htlc-mismatch branch July 12, 2018 13:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants