chaintopology: fix infinite RBF loop after replacement tx is confirmed #8935
Open
enaples wants to merge 2 commits into ElementsProject:master from
…firms
After consider_onchain_rebroadcast() creates a higher-fee replacement,
rebroadcast_txs() calls refresh() which updates otx->tx in-place, but
the confirmation guard still queries the original otx->txid (the map key):
    if (wallet_transaction_height(topo->ld->wallet, &otx->txid))
        continue;
Because the original tx was never mined (only the replacement was),
wallet_transaction_height always returns 0 and the RBF loop fires on
every subsequent block forever, even after the channel is fully resolved.
Fix: compute cur_txid from the current otx->tx before the guard. This
naturally reflects any replacement made by a prior refresh() call, so
wallet_transaction_height finds the confirmed txid and skips the entry.
otx->txid (the hash-map key) is intentionally left unchanged: mutating
the key in-place while the entry lives in the map would corrupt the table.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Important
26.04 FREEZE March 11th: Non-bugfix PRs not ready by this date will wait for 26.06.
RC1 is scheduled on March 23rd
The final release is scheduled for April 15th.
Checklist
Before submitting the PR, ensure the following tasks are completed. If an item is not applicable to your PR, please mark it as checked:
tools/lightning-downgrade

Problem
This issue has been raised twice, most recently in #8921.
rebroadcast_txs() in lightningd/chaintopology.c iterates over all pending outgoing transactions once per block and, for each one, asks "has this tx already been confirmed?" before deciding whether to rebroadcast or RBF it. The confirmation check calls wallet_transaction_height(topo->ld->wallet, &otx->txid) and skips the entry with continue when the result is non-zero.

otx->txid is initialised from the tx that was originally broadcast and is also the key of the outgoing_tx_map hash table. It is never updated after a replacement (RBF) transaction is created.

When otx->refresh (i.e. consider_onchain_rebroadcast()) is called, it may produce a higher-fee replacement and write it into otx->tx. The replacement has a different txid, but otx->txid still holds the original. The original was never mined (only the replacement was), so wallet_transaction_height(original_txid) returns 0 forever, the continue is never taken, and consider_onchain_rebroadcast is invoked on every subsequent block even though the output is already resolved.

Test scenario rationale: Why penalty transactions trigger this and to-local sweeps do not
The bug requires actual fee escalation to be observable:
consider_onchain_rebroadcast only emits an INFO log and produces a new tx when newfee > info->fee. Fee escalation happens when feerate_for_target(deadline) returns an increasing value.
Penalty transactions: the deadline is close_blockheight + to_self_delay. With watchtime-blocks = 10 the deadline is only 10 blocks away at close, so feerate_for_target returns a higher value on each new block.
To-local sweeps: the deadline is slow_sweep_deadline (~300 blocks, hardcoded). feerate_for_target returns FEERATE_FLOOR until the sweep is imminent, so fees never change, no new tx is produced, and the bug is invisible.

Why mutating otx->txid directly would crash
The first attempted fix (bitcoin_txid(otx->tx, &otx->txid) after the refresh call) caused a SIGABRT/SIGSEGV on shutdown. otx->txid is the key of the outgoing_tx_map htable (keyof_outgoing_tx_map returns &t->txid). Changing the key in-place while the entry lives in the table corrupts its internal bucket chain, producing a use-after-free when the topology is torn down in destroy_chain_topology.

Suggested Fix
Instead of storing a separate up-to-date txid (which would require touching the map key or adding a new field), we compute the txid of the current otx->tx on the fly at the top of the loop. otx->tx is always the latest version of the transaction: it is set to the original tx at broadcast time and updated by each successful refresh() call. So bitcoin_txid(otx->tx) naturally tracks whichever version was most recently broadcast (including any RBF replacement) without touching the map key. otx->txid is left completely unchanged.
The SHA-256d required by bitcoin_txid is negligible compared with the round-trip to bitcoind that follows immediately; the cost is not a concern.