Payment lifecycle refactoring #1130

Merged
t-bast merged 14 commits into master from payment-lifecycle-without-events on Sep 20, 2019

@t-bast t-bast commented Sep 6, 2019

  • Unify payment events (we really didn't need two separate type hierarchies, it became messy once PaymentResult started being sent to the eventStream)
  • Move interactions with the DB and eventStream out of PaymentLifecycle: this paves the way for payments-that-are-not-really-payments, which we don't want to store in the DB or emit events for (sub-payments in AMP or relayed trampoline payments, for example)
  • Remove information from AuditDB:
    • channel id won't make sense anymore in an AMP world and wasn't used
  • Add more information to the PaymentDB:
    • external id (to help lightning apps reconcile with their own DB)
    • parent id (AMP)
    • target node ID
    • bolt 11 invoice
    • fees paid
    • payment route in case of success
    • payment failures
  • Both Payment DB and Audit DB have a new version and migration

In particular, this supersedes #1048, #1070, #1074 and #1099
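
The second bullet of the description (moving DB and eventStream interactions out of PaymentLifecycle) can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `SendConfig`, `Recorder` and `handleResult` are hypothetical names, and the event fields are simplified to plain `Long` and `String` values.

```scala
import java.util.UUID

object PaymentEventsSketch {
  // One event hierarchy for success and failure (names mirror the PR, fields are simplified).
  sealed trait PaymentEvent { def id: UUID }
  final case class PaymentSent(id: UUID, amountMsat: Long, feesPaidMsat: Long) extends PaymentEvent
  final case class PaymentFailed(id: UUID, failures: Seq[String]) extends PaymentEvent

  // Hypothetical flags: sub-payments (AMP parts, relayed trampoline payments)
  // would set both to false.
  final case class SendConfig(storeInDb: Boolean, publishEvent: Boolean)

  // Hypothetical stand-in for the payments DB and the eventStream.
  final class Recorder {
    var stored: Vector[PaymentEvent] = Vector.empty
    var published: Vector[PaymentEvent] = Vector.empty
  }

  // The lifecycle only produces a result; the caller decides whether to persist or publish it.
  def handleResult(cfg: SendConfig, result: PaymentEvent, recorder: Recorder): PaymentEvent = {
    if (cfg.storeInDb) recorder.stored = recorder.stored :+ result
    if (cfg.publishEvent) recorder.published = recorder.published :+ result
    result
  }
}
```

With this shape, a parent actor handling a multi-part payment could run several lifecycles with both flags off and emit a single aggregated event itself.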

Here is what a payment success looks like:
[image: payment_success]

And here is a payment failure:
[image: payment_failure]

@t-bast t-bast requested a review from pm47 September 6, 2019 14:22
@codecov-io

Codecov Report

Merging #1130 into master will increase coverage by 0.25%.
The diff coverage is 96.59%.

@@            Coverage Diff             @@
##           master    #1130      +/-   ##
==========================================
+ Coverage   83.63%   83.89%   +0.25%     
==========================================
  Files         103      104       +1     
  Lines        7743     7784      +41     
  Branches      315      323       +8     
==========================================
+ Hits         6476     6530      +54     
+ Misses       1267     1254      -13
Impacted Files Coverage Δ
...main/scala/fr/acinq/eclair/payment/Autoprobe.scala 0% <ø> (ø) ⬆️
...c/main/scala/fr/acinq/eclair/payment/Relayer.scala 89.04% <0%> (+0.6%) ⬆️
...cala/fr/acinq/eclair/db/sqlite/SqliteAuditDb.scala 98.94% <100%> (+0.28%) ⬆️
.../fr/acinq/eclair/payment/LocalPaymentHandler.scala 100% <100%> (ø) ⬆️
...ala/fr/acinq/eclair/payment/PaymentInitiator.scala 100% <100%> (ø) ⬆️
...c/main/scala/fr/acinq/eclair/payment/Auditor.scala 92.64% <100%> (+0.1%) ⬆️
.../scala/fr/acinq/eclair/payment/PaymentEvents.scala 100% <100%> (ø)
.../scala/fr/acinq/eclair/db/sqlite/SqliteUtils.scala 100% <100%> (ø) ⬆️
...ala/fr/acinq/eclair/payment/PaymentLifecycle.scala 88.99% <94.44%> (+0.29%) ⬆️
...src/main/scala/fr/acinq/eclair/router/Router.scala 91.84% <0%> (-0.55%) ⬇️
... and 5 more

t-bast commented Sep 6, 2019

After offline discussion, I think I'll make the following changes:

  • remove payment failed events from audit DB (or should I keep them?)
  • add payment failures to the payment DB
  • convert Hop to a simpler HopSummary(nodeId, nextNodeId, channelId) (we don't need all the channel updates in the DB when we're only interested in logging the route when a payment succeeded) -> do it in a cleaner way than currently
  • do the same trick for failures (convert to a lighter FailureSummary to be stored in DB)

Since DB changes require migrations and careful (painful?) testing, are we ok with that plan?
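
A minimal sketch of the Hop to HopSummary conversion proposed above. The `HopSummary` field list comes from the comment; everything else is an assumption: identifiers are plain strings instead of the real key and channel id types, and `ChannelUpdate` is reduced to a few illustrative fields.

```scala
object HopSummarySketch {
  // Stand-ins for the real types (assumed shapes, for illustration only).
  final case class ChannelUpdate(channelId: String, feeBaseMsat: Long, feeProportionalMillionths: Long, cltvExpiryDelta: Int)
  final case class Hop(nodeId: String, nextNodeId: String, lastUpdate: ChannelUpdate)

  // Lightweight version stored in the DB: no channel update payload.
  final case class HopSummary(nodeId: String, nextNodeId: String, channelId: String)

  def toSummary(hop: Hop): HopSummary =
    HopSummary(hop.nodeId, hop.nextNodeId, hop.lastUpdate.channelId)

  def summarize(route: Seq[Hop]): Seq[HopSummary] = route.map(toSummary)
}
```

The same pattern would apply to a lighter `FailureSummary`: keep only what is needed to display or log the failure, and drop the rest before writing to the DB.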

@araspitzu (Contributor)

Nice changes, especially storing the route for a successful payment (can be used later for path finding heuristics). On the payment events: is it possible to unify the events extending fr.acinq.eclair.payment.PaymentEvent with fr.acinq.eclair.db.IncomingPayment and fr.acinq.eclair.db.OutgoingPayment? They are used in different DBs but in fact contain almost exactly the same data and it feels a bit redundant.

Unify payment events.
Factorize DB and eventStream interactions: this paves the way for sub-payments that shouldn't be stored in the DB nor emit events.
ChannelId is removed (won't make sense for AMP).
Fixed typo in ChannelErrorOccurred.
* bolt 11 invoice
* external id
* parent id (AMP)
* target node id
* fees
* route
* failures
@t-bast t-bast force-pushed the payment-lifecycle-without-events branch from 74bc61a to 7f728e1 Compare September 17, 2019 12:31
@t-bast t-bast requested review from dpad85 and pm47 September 17, 2019 12:32
dpad85 commented Sep 17, 2019

I'd like to use this PR to add a method in PaymentDB that retrieves a list of incoming+outgoing payments in a single call. This method would query the sent_payments and received_payments tables and build a list of objects designed to be as light as possible for good performance (thinking of the mobile app here).

case class Payment(id: Either[UUID, ByteVector32],
                   amount: Option[MilliSatoshi],
                   paymentRequest: Option[String],
                   status: Either[OutgoingPaymentStatus, IncomingPaymentStatus],
                   completedAt: Option[Long])

Note that paymentRequest is a String because deserializing a PaymentRequest is costly especially if the list is large.
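
To make the proposal concrete, here is a rough sketch of how such a combined listing could be built. Only the `Payment` shape comes from the comment above. The row types, status values, the `listPayments` name and the ordering are assumptions, and `ByteVector32` and `MilliSatoshi` are replaced by simple aliases so the example is self-contained.

```scala
import java.util.UUID

object PaymentOverviewSketch {
  type ByteVector32 = String // stand-in for the real 32-byte type
  type MilliSatoshi = Long   // stand-in for the real amount type

  sealed trait OutgoingPaymentStatus
  case object OutgoingPending extends OutgoingPaymentStatus
  case object OutgoingSucceeded extends OutgoingPaymentStatus
  case object OutgoingFailed extends OutgoingPaymentStatus

  sealed trait IncomingPaymentStatus
  case object IncomingPending extends IncomingPaymentStatus
  case object IncomingReceived extends IncomingPaymentStatus

  final case class Payment(id: Either[UUID, ByteVector32],
                           amount: Option[MilliSatoshi],
                           paymentRequest: Option[String],
                           status: Either[OutgoingPaymentStatus, IncomingPaymentStatus],
                           completedAt: Option[Long])

  // Hypothetical rows read from the sent_payments and received_payments tables.
  final case class SentRow(id: UUID, amount: MilliSatoshi, paymentRequest: Option[String], status: OutgoingPaymentStatus, completedAt: Option[Long])
  final case class ReceivedRow(paymentHash: ByteVector32, amount: Option[MilliSatoshi], paymentRequest: String, status: IncomingPaymentStatus, receivedAt: Option[Long])

  def listPayments(sent: Seq[SentRow], received: Seq[ReceivedRow]): Seq[Payment] = {
    val outgoing = sent.map(s => Payment(Left(s.id), Some(s.amount), s.paymentRequest, Left(s.status), s.completedAt))
    // The invoice stays a raw String: it is never decoded here.
    val incoming = received.map(r => Payment(Right(r.paymentHash), r.amount, Some(r.paymentRequest), Right(r.status), r.receivedAt))
    // Assumed ordering: most recently completed first, payments that are not completed last.
    (outgoing ++ incoming).sortBy(_.completedAt.getOrElse(0L))(Ordering[Long].reverse)
  }
}
```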

t-bast commented Sep 18, 2019

> I'd like to use this PR to add a method in PaymentDB that retrieves a list of incoming+outgoing payments in a single call.

That sounds reasonable, I'll add that too.
Many changes to the PaymentsDb interface coming in the next commits ;)

// updates the status of the payment, if the newStatus is SUCCEEDED you must supply a preimage
def updateOutgoingPayment(id: UUID, newStatus: OutgoingPaymentStatus.Value, preimage: Option[ByteVector32] = None)
/** Update the status of the payment in case of success. */
def updateOutgoingPayment(paymentResult: PaymentSent)
Member

Nit: could have been an Either[PaymentFailed, PaymentSent]. Not sure what's best

Member Author

Not a strong opinion, but I slightly prefer overloading in that case (it's easier for the caller).
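
For readers comparing the two shapes discussed in this thread, here is a small sketch. It is illustrative only: the `PaymentFailed` overload is assumed from the `Either[PaymentFailed, PaymentSent]` suggestion, the event fields are simplified, and `InMemoryPaymentsDb` is a hypothetical stand-in for the real DB.

```scala
import java.util.UUID

object UpdateOutgoingSketch {
  final case class PaymentSent(id: UUID, preimage: String)
  final case class PaymentFailed(id: UUID, failures: Seq[String])

  final class InMemoryPaymentsDb {
    private var statuses = Map.empty[UUID, String]

    // Option A (overloading): the caller passes whichever event it already has.
    def updateOutgoingPayment(result: PaymentSent): Unit = statuses += (result.id -> "SUCCEEDED")
    def updateOutgoingPayment(result: PaymentFailed): Unit = statuses += (result.id -> "FAILED")

    // Option B (review suggestion): a single entry point, the caller wraps the event.
    def updateOutgoingPaymentEither(result: Either[PaymentFailed, PaymentSent]): Unit = result match {
      case Left(failed) => updateOutgoingPayment(failed)
      case Right(sent) => updateOutgoingPayment(sent)
    }

    def status(id: UUID): Option[String] = statuses.get(id)
  }
}
```

With overloading the call site reads `db.updateOutgoingPayment(event)` in both cases, which is the convenience mentioned in the reply. The `Either` version makes the two outcomes explicit in one signature at the cost of wrapping at each call site.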

…ructures.

Clarify use of seconds / milliseconds -> we use milliseconds everywhere except at the Eclair API level
(probably because it's easier from bash to get a unix timestamp in seconds than in milliseconds).
@t-bast t-bast requested review from dpad85, pm47 and sstone September 19, 2019 07:39
t-bast commented Sep 19, 2019

The latest commits should resolve all the pending comments (please mark them resolved if everything looks good). I chose a slightly different naming which I think makes sense.

New things that were added and not necessarily discussed:

  • Harmonized Long to always be milliseconds (even for the from and to fields used in listXXX functions)
  • Made payment expiry explicitly not null -> the spec says that in the absence of an explicit expiry, we use 1 hour. That means the expire_at column in the PaymentsDb should always be set to a non-null value to avoid discrepancies between that column and calling paymentRequest.isExpired
  • Run migrations/init inside SQL transactions: there was an issue with our previous transaction code: it neither committed the changes nor rolled back. It only looked like it worked because we're using a single connection to the DB, and it was memory-inefficient (intermediate results were kept locally in the connection instead of being committed to the DB)
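
The transaction issue in the last bullet comes down to always ending a transaction with either a commit or a rollback. A minimal sketch of such a helper on top of plain JDBC is shown below. `inTransaction` is a hypothetical name and this is not the helper used in the codebase; it only relies on standard `java.sql.Connection` methods.

```scala
import java.sql.Connection

object SqlTransactionSketch {
  // Runs `block` inside a transaction: commit on success, rollback on any error.
  def inTransaction[T](conn: Connection)(block: Connection => T): T = {
    val previousAutoCommit = conn.getAutoCommit
    conn.setAutoCommit(false) // start grouping statements into one transaction
    try {
      val result = block(conn)
      conn.commit() // make the changes durable
      result
    } catch {
      case t: Throwable =>
        conn.rollback() // discard partial changes, then propagate the error
        throw t
    } finally {
      conn.setAutoCommit(previousAutoCommit) // restore the caller's setting
    }
  }
}
```

A migration would then wrap all of its statements in a single `inTransaction(connection) { c => ... }` call, so that a failure halfway through leaves the previous schema version untouched.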

I'm investigating changing the PaymentSent and PaymentReceived events to each contain a list of parts. This way the AuditDB can store each sub-payment and keep information about the channelId.

dpad85 previously approved these changes Sep 19, 2019
This effectively reverts a previous commit and removes the need for a DB migration.
Payments are now always represented as a list of partial payments.
}

case class PaymentSent(id: UUID, paymentHash: ByteVector32, paymentPreimage: ByteVector32, parts: Seq[PaymentSent.PartialPayment]) extends PaymentEvent {
require(parts.nonEmpty, "sent payment is empty")
Member

Suggested change
require(parts.nonEmpty, "sent payment is empty")
require(parts.nonEmpty, "must have at least one subpayment")


case class PaymentReceived(amount: MilliSatoshi, paymentHash: ByteVector32, fromChannelId: ByteVector32, timestamp: Long = Platform.currentTime) extends PaymentEvent
case class PaymentReceived(paymentHash: ByteVector32, parts: Seq[PaymentReceived.PartialPayment]) extends PaymentEvent {
require(parts.nonEmpty, "received payment is empty")
Member

Suggested change
require(parts.nonEmpty, "received payment is empty")
require(parts.nonEmpty, "must have at least one subpayment")

require(parts.nonEmpty, "sent payment is empty")
val amount: MilliSatoshi = parts.map(_.amount).sum
val feesPaid: MilliSatoshi = parts.map(_.feesPaid).sum
val timestamp: Long = parts.map(_.timestamp).min
Member

A comment explaining the rationale behind using min would be useful.

case class PaymentReceived(paymentHash: ByteVector32, parts: Seq[PaymentReceived.PartialPayment]) extends PaymentEvent {
require(parts.nonEmpty, "received payment is empty")
val amount: MilliSatoshi = parts.map(_.amount).sum
val timestamp: Long = parts.map(_.timestamp).max
Member

A comment explaining the rationale behind using max would be useful.
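
The two review comments above ask for the reasoning behind `min` and `max`. The sketch below shows the aggregation with the kind of comment being requested. The rationale in the comments is one plausible reading and is an assumption here, not a statement from this thread; types are simplified to `Long` and `String`.

```scala
import java.util.UUID

object PartialPaymentsSketch {
  final case class SentPart(id: UUID, amountMsat: Long, feesPaidMsat: Long, timestamp: Long)
  final case class ReceivedPart(amountMsat: Long, fromChannelId: String, timestamp: Long)

  final case class PaymentSent(id: UUID, parts: Seq[SentPart]) {
    require(parts.nonEmpty, "must have at least one subpayment")
    val amountMsat: Long = parts.map(_.amountMsat).sum
    val feesPaidMsat: Long = parts.map(_.feesPaidMsat).sum
    // Assumed rationale: the sender holds the proof of payment as soon as
    // the first part is fulfilled, so the earliest timestamp is used.
    val timestamp: Long = parts.map(_.timestamp).min
  }

  final case class PaymentReceived(paymentHash: String, parts: Seq[ReceivedPart]) {
    require(parts.nonEmpty, "must have at least one subpayment")
    val amountMsat: Long = parts.map(_.amountMsat).sum
    // Assumed rationale: the recipient only has the full amount once
    // the last part has arrived, so the latest timestamp is used.
    val timestamp: Long = parts.map(_.timestamp).max
  }
}
```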

case class PaymentSent(id: UUID, amount: MilliSatoshi, feesPaid: MilliSatoshi, paymentHash: ByteVector32, paymentPreimage: ByteVector32, toChannelId: ByteVector32, timestamp: Long = Platform.currentTime) extends PaymentEvent
object PaymentSent {

case class PartialPayment(id: UUID, amount: MilliSatoshi, feesPaid: MilliSatoshi, toChannelId: ByteVector32, route: Seq[Hop], timestamp: Long = Platform.currentTime)
Member

You seem to allow not providing a route. I think it would be cleaner to use an option instead of a zero-size sequence.

Suggested change
case class PartialPayment(id: UUID, amount: MilliSatoshi, feesPaid: MilliSatoshi, toChannelId: ByteVector32, route: Seq[Hop], timestamp: Long = Platform.currentTime)
case class PartialPayment(id: UUID, amount: MilliSatoshi, feesPaid: MilliSatoshi, toChannelId: ByteVector32, route: Option[Seq[Hop]], timestamp: Long = Platform.currentTime)

Member Author

Actually you really need to provide a route all the time so I like having directly a Seq.
You're referring to the hackish part in the AuditDB where we set this to Nil because it's not really relevant for Audit right? I wanted to avoid creating yet another type, but I think putting an option there only because of this small AuditDB hack would be too harmful for the normal code...

Member

@pm47 pm47 Sep 20, 2019

No, I was referring to the relayer
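
The trade-off in this thread can be illustrated with a small sketch. The types below are simplified stand-ins and `describe` is a hypothetical helper; the point is only what a consumer can tell apart in each representation.

```scala
object RouteFieldSketch {
  final case class Hop(nodeId: String, nextNodeId: String)

  // Current shape: an empty Seq may mean "not known" or "not relevant".
  final case class PartWithSeq(amountMsat: Long, route: Seq[Hop])
  // Suggested shape: the absence of a route is explicit.
  final case class PartWithOption(amountMsat: Long, route: Option[Seq[Hop]])

  def describe(part: PartWithSeq): String =
    if (part.route.isEmpty) "route unknown or empty (ambiguous)" else s"${part.route.size} hop(s)"

  def describe(part: PartWithOption): String = part.route match {
    case None => "route unknown"
    case Some(hops) => s"${hops.size} hop(s)"
  }
}
```

The `Option` variant forces every producer and consumer to handle the missing case, which is the cost the author points out for code paths that always have a route.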

statement.setBytes(2, e.paymentHash.toArray)
statement.setBytes(3, p.fromChannelId.toArray)
statement.setLong(4, p.timestamp)
statement.addBatch()
Member

addBatch ✌️
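
For context, a sketch of the batching pattern the snippet above uses: one prepared statement, one `addBatch()` per partial payment, and a single `executeBatch()` at the end. The table name, the meaning of parameter 1 and the `ReceivedPart` type are assumptions; only parameters 2 to 4 appear in the diff.

```scala
import java.sql.Connection

object AuditBatchSketch {
  final case class ReceivedPart(amountMsat: Long, fromChannelId: Array[Byte], timestamp: Long)

  // Inserts one row per partial payment using a single prepared statement.
  def addReceived(conn: Connection, paymentHash: Array[Byte], parts: Seq[ReceivedPart]): Unit = {
    // Hypothetical table layout: (amount_msat, payment_hash, from_channel_id, timestamp).
    val statement = conn.prepareStatement("INSERT INTO received VALUES (?, ?, ?, ?)")
    try {
      parts.foreach { p =>
        statement.setLong(1, p.amountMsat)
        statement.setBytes(2, paymentHash)
        statement.setBytes(3, p.fromChannelId)
        statement.setLong(4, p.timestamp)
        statement.addBatch() // queue this row, nothing is executed yet
      }
      statement.executeBatch() // execute all queued rows together
    } finally {
      statement.close()
    }
  }
}
```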

@t-bast t-bast merged commit 401c996 into master Sep 20, 2019
@t-bast t-bast deleted the payment-lifecycle-without-events branch September 20, 2019 12:29
araspitzu added a commit that referenced this pull request Sep 24, 2019
araspitzu added a commit that referenced this pull request Sep 24, 2019
sstone added a commit that referenced this pull request Oct 8, 2019
* Update list of commands in eclair-cli help (#1091)

* Add missing API endpoints to eclair-cli help

* Documentation update (#1092)

* Typed amounts (#1088)

* Route computation: fix fee check (#1101)

Fee check during route computation is:
- fee is below maximum value
- OR fee is below amount * maximum percentage

The second check was buggy and route computation would fail when fees were above the maximum value but below the maximum percentage of the amount being paid.
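
The check described above can be written as a single predicate. This is a sketch with hypothetical parameter names, and treating "below" as an inclusive bound is an assumption.

```scala
object FeeCheckSketch {
  // A route's fee is acceptable if it is under the absolute cap
  // OR under the given fraction of the amount being paid.
  def feeIsAcceptable(feeMsat: Long, amountMsat: Long, maxFeeBaseMsat: Long, maxFeePct: Double): Boolean =
    feeMsat <= maxFeeBaseMsat || feeMsat <= (amountMsat * maxFeePct).toLong
}
```

For example, with a cap of 21000 msat and a 3% limit, a 25000 msat fee on a 1000000 msat payment is above the cap but still accepted through the percentage branch, which is the case the bug used to reject.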

* Publish transactions during transitions (#1089)

Follow up to #1082.

The goal is to be able to publish transactions only after we have
persisted the state. Otherwise we may run into corner cases like [1]
where a refund tx has been published, but we haven't kept track of it
and generate a different one (with different fees) the next time.

As a side effect, we can now remove the special case that we were
doing when publishing the funding tx, and remove the `store` function.

NB: the new `calling` transition method isn't restricted to publishing
transactions but that is the only use case for now.

[1] ACINQ/eclair-mobile#206

* Typed cltv expiry (#1104)

Untyped cltv expiry was confusing: delta and absolute expiries really need to be handled differently.
Even variable names were sometimes misleading.
Now the compiler will help us catch errors early.

* Extended queries optional (#899)

This is the implementation of lightning/bolts#557.

* Correctly handle multiple channel_range_replies

The scheme we use to keep track of channel queries with each peer would forget about
missing data when several channel_range_replies are sent back for a single channel_range_queries.

* RoutingSync: remove peer entry properly

* Remove peer entry on our sync map only when we've received
a `reply_short_channel_ids_end` message.
* Make routing sync test more explicit

* Do not send channel queries if we don't want to sync

* Router: clean our sync state when we (re)connect to a peer

We must clean up leftovers for the previous session and start the sync process again.

* Router: reset sync state on reconnection

When we're reconnected to a peer we will start a new sync process and should reset our sync
state with that peer.

* Extended Queries: use TLV format for optional data

Optional query extensions now use TLV instead of a custom format.
Flags are encoded as varint instead of bytes as originally proposed. With the current proposal they will all fit on a single byte, but will be
much easier to extend this way.

* TLV Stream: Implement a generic "get" method for TLV fields

If we have a TLV stream of type MyTLV which is a subtype of TLV, and MyTLV1 and MyTLV2 are both
subtypes of MyTLV then we can use stream.get[MyTLV1] to get the TLV record of type MyTLV1 (if any)
in our TLV stream.
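
A minimal sketch of such a generic `get`, using a `ClassTag` so the requested subtype can be matched at runtime. The record types are placeholders and the stream is reduced to a list of records, so this illustrates the technique without reproducing the project's `TlvStream`.

```scala
import scala.reflect.ClassTag

object TlvStreamSketch {
  sealed trait Tlv
  sealed trait MyTlv extends Tlv
  final case class MyTlv1(value: Int) extends MyTlv
  final case class MyTlv2(value: String) extends MyTlv

  final case class TlvStream[T <: Tlv](records: Seq[T]) {
    // Returns the first record whose runtime class is R, if any.
    // The ClassTag makes the `case r: R` type test possible despite erasure.
    def get[R <: T : ClassTag]: Option[R] = records.collectFirst { case r: R => r }
  }
}
```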

* Channel range queries: send back node announcements if requested (#1108)

This PR adds support for sending back node announcements when replying to channel range queries:
- when explicitly requested (bit is set in the optional query flag)
- when query flags are not used and a channel announcement is sent (as per the BOLTs)

A new configuration option `request-node-announcements` has been added in the `router` section. If set to true, we
will request node announcements when we receive a channel id (through channel range queries) that we don't know of.
This is a setting that we will probably turn off on mobile devices.

* Rework router data structures (#902)

Instead of using two separate maps (for channels and channel_updates), we now use a single map, which groups channel+channel_updates. This is also true for data storage, resulting in the removal of the channel_updates table.

* Add more numeric utilities to MilliSatoshi (#1103)

Add comparisons and postfix operators.
Update most of the codebase to leverage those.

* Use unsigned comparison for 'maxHtlcValueInFlightMsat' (#1105)

* Add a sync whitelist (#954)

We will only sync with whitelisted peers. If the whitelist is empty then
we sync with everyone.

* Move http APIs to subproject eclair-node (#1102)

* Fix regression in `Commitments.availableForSend` (#1107)

We must consider `nextRemoteCommit` when applicable.

This is a regression caused in #784. The core bug only exists when we
have a pending unacked `commit_sig`, but since we only send the
`AvailableBalanceChanged` event when sending a signature (not when
receiving a revocation), actors relying on this event to know the
current available balance (e.g. the `Relayer`) will have a wrong
value in-between two outgoing sigs.

* Bolt4: remove final_expiry_too_soon error message (#1106)

It allowed probing attacks and the spec deprecated it in favor of IncorrectOrUnknownPaymentDetails.
Also add better support for unknown failure messages.

* Fix maven mirror (#1120)

* Use Long to back the UInt64 type (#1109)

* Define comparison operators between UInt64 and MilliSatoshi

* Implement Bolt 11 invoice feature bits (#1121)

lightning/bolts#656 introduced invoice feature bits as a pre-requisite for AMP and other advanced payment use-cases.

* Update docker build (#1123)

* Update docker base image to jdk11, update maven to 3.6.2 [ci skip]

* Reject expired invoices before payment flow starts (#1117)

* Made sync params configurable (#1124)

This allows us to choose smaller parameters for tests and reduce cpu
requirement during testing.

NB: The default value of 3500 for `reply_channel_range` was wrong. Theoretical max is ~2700.

* Activate support for variable-length onion (#1087)

This is now enabled by default.
We forward variable-length onions if we receive some.
We accept variable-length payments.
However for maximum compatibility with the network, we send payments using legacy payloads.

* Add Semaphore CI (#1125)

* Router computes network stats (#1116)

* Add comments and fix warnings in graph processing
* Add small feature to set the htlcMaximumMsat for routing hints (otherwise the graph processing algorithm used a minimum value which slightly reduced the benefits of those routing hints)
* Add the computation of network statistics to the router: this will be useful for multi-part payments to decide what thresholds should be used to split a payment

* Add monitoring with Kamon (disabled by default) (#1126)

For now:
- we only track some tasks (especially in the router, but not even
`node_announcement` and `channel_update`)
- all db calls are monitored
- kamon is disabled by default

* Check funds in millisatoshi when sending/receiving an HTLC (#1128)

Instead of satoshi, which could introduce rounding errors.

Also, we check first the balance before the max-inflight amount, because
it makes more sense in terms of error management.
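
To illustrate why the unit matters, the sketch below compares a check done after truncating to whole satoshis with one done directly in millisatoshi. The function names are hypothetical and truncation is only an assumed source of the rounding error mentioned above.

```scala
object HtlcFundsCheckSketch {
  // Check after converting both values to whole satoshis (integer division truncates).
  def canAffordInSat(balanceMsat: Long, htlcMsat: Long): Boolean =
    htlcMsat / 1000 <= balanceMsat / 1000

  // Check directly in millisatoshi: no precision is lost.
  def canAffordInMsat(balanceMsat: Long, htlcMsat: Long): Boolean =
    htlcMsat <= balanceMsat
}
```

With a balance of 1500 msat and an HTLC of 1999 msat, both values truncate to 1 sat, so the satoshi check passes while the millisatoshi check correctly rejects the HTLC.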

Co-Authored-By: Bastien Teinturier <31281497+t-bast@users.noreply.github.com>

* Don't hardcode the channel version (#1129)

Instead of hardcoding the channel version when we instantiate the
`Commitments` object, we rather define it when the channel is
instantiated. This is saner and prepares future usage.

* Removed Globals class (#1127)

This is a prerequisite to parallelization of tests.

* Make tests run in parallel (#1112)

There are two levels of parallelization:
- between test suites (a suite = a test file)
- within a suite (depends on the test suite: some rely on sequential execution of tests, some don't)

* Add codecov integration to semaphore CI (#1134)

* Remove codecov integration from travis CI

* Drop support for Java 8 (#1135)

We already have Java 7 (for Android) and Java 11. Supporting Java 8
would require crossbuilding, which we are not doing (two recent PRs
broke the build on Java 8).

* Sphinx: accept invalid downstream errors (#1137)

When a downstream node sends us an onion error with an invalid length, we must forward the failure.
The recipient won't be able to extract the error but at least it knows the payment failed.

* Update string to match on bitcoind while it's indexing (#1138)

* Check for bitcoind's getrawtransaction availability during startup

* Peer: disable kamon

* Payment lifecycle refactoring (#1130)

* Unify payment events (no more duplication between payment types and events)
* Factorize DB and eventStream interactions: this paves the way for sub-payments that shouldn't be stored in the DB nor emit events.
* Add more fields to the payments DB:
  * bolt 11 invoice for sent payment
  * external id (for app developers)
  * parent id (AMP)
  * target node id
  * fees
  * route (if success)
  * failures (if failed)
* Re-work the PaymentsDb interface
* Clarify use of seconds / milliseconds in DB interfaces -> milliseconds everywhere
* Run SQL migrations inside transactions

* Improve error handling when we couldn't find all the channels for a supplied route in /sendtoroute API (#1142)

* Improve error handling when we couldn't find all the channels for a supplied route in /sendtoroute

* Handle fees increases when channel is OFFLINE (#1080)

* Add 'close-on-offline-feerate-mismatch' configuration to avoid closing offline channels when the feerate mismatch is over the threshold.

* Derive channel keys from the channel funding pubkey (#1097)

We now generate a random funding key for each new channel, and use its public key to deterministically derive all channel keys and secrets. This will let us easily recover funds using DLP even if we've lost everything but our seed: we just need to connect to the node we had a channel with, ask them to publish their commit tx, and once we see it on the blockchain we can extract our funding pubkey, recompute channel keys and spend our output.

* Add a "funding pubkey path" option to the channel version field

This option is checked when we need to compute channel keys. For old channels it won't be set, and we always set it for new ones.

* ChannelVersion: make sure that all bits are set to 0 for legacy channels

* ChannelVersion: USE_PUBKEY_KEYPATH is set by default

* Check if remote funder can handle an updated commit fee when sending HTLC (#1084)

If the sender of an htlc isn't the funder, then both sides will have to afford the payment:
- the sender needs to be able to afford the htlc amount
- the funder needs to be able to afford the greater commit tx fee incurred by the additional htlc output.

Fixes #1081.

Co-Authored-By: Pierre-Marie Padiou <pm47@users.noreply.github.com>

* Fix and expand channel keypath (#1147)

* Fix funding pubkey to channel key path computation

Channel key path is generated from 8 bytes computed from our funding pubkey, but we extracted 4 uint32 values instead of 2 (last 2 were always 0). We now use 128 bits to derive channel key paths.

* Add a channel key path compatibility test

This test will fail if we change the way we compute channel key paths, which would break existing channels.

* Use the same chain hash reference in all channel updates

To save memory, once we check that a channel_update's chain hash matches what
we expect we just replace it with a reference to our own chain hash.

* Commitments: take HTLC fee into account (#1152)

Our balance computation was slightly incorrect. If you want to know how much you can send (or receive), you need to take into account the fact that you'll add a new HTLC which adds weight to the commit tx (and thus adds fees).

* Android: add a spray-based API to eclair-node

This is a copy of the spray-based API developed by @araspitzu (akka-http does not
work for akka 2.3 which we use on the android branch)

* HTTP API: add type hints for payment status (#1150)

Cleans up the JSON payment status (easier to interpret for callers).

* Use "mock" Kamon library

Kamon does not work on Android and does not make much sense, so we replace
it with a basic Mock implementation that does nothing.

* Electrum: improve coin selection (fixes #1146) (#1149)

Our previous coin selection would sometimes fail when there was one wallet utxo and a low
feerate, because our first pass used a fee estimate that was too high and could sometimes not be met.

* Extend funding key path to 256 bits (#1154)

Our random funding key path is now 8 * 32 bits plus a 1' (funder) or 0' (fundee).
Channel key paths are computed from the sha256 of the funding public key (we take all 256 bits).

* Use bitcoin 0.18.1 in the test (#1148)

* Upgrade new unit tests to bitcoin 0.18.1 API (#1157)

We had 2 open PRs, one that added new tests using the 0.API, one that switched to 0.18.1, when they were merged the new tests failed since they had not been upgraded....

* Update netty dependency to 4.1.32 (#1160)

Also:
* explicitly set endpoint identification algorithm in strict mode
* force TLS protocols 1.2/1.3 in strict mode

Co-Authored-By: Bastien Teinturier <31281497+t-bast@users.noreply.github.com>

* Add execution time limit (#1161)

* Android: wipe channels table during db migration

We already wipe the updates table, and this makes upgrading much simpler since we had different structures on
android vs master.

* Activate extended channel range queries (#1165)
By default we now set the `gossip_queries_ex` feature bit.
We also change how we compare feature bits, and will use channel queries (or extended queries) only if the corresponding feature bit is set in both local and remote init messages.

* Use guava to compute CRC32C checksums (#1166)

CRC32C is not available in JDK 7 which we target on Android.
6 participants