Only splice when needed by amackillop · Pull Request #14 · moneydevkit/rust-lightning

amackillop · 2026-04-16T20:32:09Z

Problem

Every time a peer reconnects with pending HTLCs, the LSP triggers a splice that turns out to be unnecessary.

The sequence: LSP intercepts HTLC for offline peer, sends webhook, peer reconnects. peer_connected fires before channel_reestablish completes, so all channels are is_usable=false. calculate_htlc_actions_for_peer builds its capacity map from usable channels (empty), finds zero capacity, falls through to the splice filter which matches on is_channel_ready (finds the mid-reestablish channel), and emits a splice. ~7 seconds later the splice completes. Meanwhile reestablish finished in ~100-500ms and the HTLC was forwarded through the existing channel that had plenty of capacity all along.

The wasted splice also sets a 30s liquidity cooldown, blocking all subsequent liquidity actions for that peer.

Fix

Two commits, split because the first is a prerequisite for the second.

1. Persist HTLCs before action calculation

htlc_intercepted only persisted HTLCs when a liquidity action was needed. This prevents calculate_htlc_actions_for_peer from ever returning "do nothing" (empty actions) because the HTLC would be lost — not in the store for the timer, not forwarded.

Move the insert before action calculation. The cost is two extra S3 round-trips (write + delete) per HTLC on the forward path, where previously there were none. This hits every payment. Unavoidable because we don't know the action result before calculating it.

Also replaces the unwrap() on insert with explicit error handling: best-effort for forwards, fail-back for liquidity actions.

2. Prevent spurious splice on reconnect

Two changes in calculate_htlc_actions_for_peer:

Early return when channels exist but none are usable. The capacity map is empty so any decision (splice, open, forward) would be wrong. The HTLC stays in the store and the 1Hz timer retries once reestablish completes.
Splice candidate filter changed from is_channel_ready to is_usable. A mid-reestablish channel may already have sufficient capacity that just isn't visible yet. Splicing adds unnecessary on-chain cost for capacity that was never actually insufficient.

Verification

cargo test -p lightning-liquidity --lib — 65/65 pass
After deploying: logs should show "deferring decision" instead of "Splicing into channel" on reconnect
Payment latency should drop from ~10s (3s startup + 7s splice) to ~4s (3s startup + ~1s timer tick)

htlc_intercepted only persisted HTLCs when a liquidity action (splice/open) was needed. This prevents calculate_htlc_actions from returning empty actions (i.e. "do nothing, let the timer handle it") because the HTLC would never make it into the store and the timer would never see it. Move the insert before calculate_htlc_actions_for_peer so every intercepted HTLC is in the store before we decide what to do with it. execute_htlc_actions already removes the HTLC on successful forward, so the store entry is short-lived on the happy path. The cost is two extra KV store round-trips (S3 write + delete) per HTLC on the forward-only path, where previously there were none. This hits every payment since LSPS4 intercepts all of them. Unfortunately unavoidable: the insert must happen before action calculation because we don't know yet whether the result will be "forward now" or "do nothing and defer to the timer." If we wait until after and the result is "do nothing," we've already lost the HTLC. Replace the unconditional unwrap on insert with explicit error handling. On persistence failure: - Forward-only: proceed. If the forward succeeds the store entry was redundant. If persistence failed and the channel becomes unusable between calculate and execute, the HTLC is orphaned (not in store for timer retry, not forwarded) until LDK's CLTV timeout cleans it up. Requires both S3 failure and a channel state change between two adjacent calls. - Liquidity action needed: fail the HTLC back to sender. The splice/open is async and the timer needs the stored HTLC to forward once the channel is ready. Without it, we'd spend on-chain fees on a splice that never results in a forward.

When the LSP intercepts an HTLC for an offline peer, the peer reconnects and peer_connected fires before channel_reestablish completes. All channels report is_usable=false at this point. calculate_htlc_actions_for_peer builds its capacity map from is_usable channels (empty set), finds zero capacity, then falls through to the splice candidate filter. That filter used is_channel_ready, which matches mid-reestablish channels, so it emits a splice. Once reestablish finishes, the HTLC gets forwarded through the existing channel. The splice was wasted, and the 30s liquidity cooldown it sets blocks all subsequent liquidity actions for that peer. Two fixes in calculate_htlc_actions_for_peer: 1. Early return when channels exist but none are usable. The capacity map is empty so any decision would be wrong. The HTLC stays in the store (persisted by the prior commit) and the 1Hz timer retries once channels become usable. 2. Splice candidate filter changed from is_channel_ready to is_usable. A mid-reestablish channel maye already have sufficient capacity but it's just not visible yet. Splicing into it adds unnecessary on-chain cost and a 30s cooldown for capacity that was never actually insufficient. splice_channel() would also fail if the channel does become usable in time.

amackillop added 2 commits April 16, 2026 21:18

amackillop changed the title ~~Austin only splice when needed~~ Only splice when needed Apr 16, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Only splice when needed#14

Only splice when needed#14
amackillop wants to merge 2 commits intoaustin_defer-usability-checkfrom
austin_only-splice-when-needed

amackillop commented Apr 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

amackillop commented Apr 16, 2026

Problem

Fix

1. Persist HTLCs before action calculation

2. Prevent spurious splice on reconnect

Verification

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant