Defer ChainMonitor completion signals to ensure manager-first persistence by joostjager · Pull Request #4422 · lightningdevkit/rust-lightning

joostjager · 2026-02-16T08:56:09Z

This is a simpler alternative to #4351 for solving the monitor/manager persistence ordering problem. Instead of deferring the monitor operations themselves (persist, insert, apply update), this PR only defers the completion signals. The operations execute inline as before, but ChainMonitor holds back the channel_monitor_updated calls until flush() is invoked. This means the ChannelManager sees all monitor updates as InProgress until the caller has persisted the ChannelManager and explicitly flushed.

Compared to #4351, this approach is simpler because it does not need to queue full monitor operations. There is no buffering of ChannelMonitor data or updates and no deferred persistence. The monitors are persisted immediately, and only the lightweight completion signal (ChannelId, u64) is queued. That said, this implementation does lean more towards implementing the deferral at the Persist level.
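To make the idea concrete, here is a minimal sketch of a completion-signal queue, not the actual ChainMonitor code. `DeferredCompletions`, the simplified `ChannelId` wrapper, and the `delivered` list are hypothetical stand-ins; only the `(ChannelId, u64)` pair and the `channel_monitor_updated`/`flush` names come from the description above.

```rust
use std::collections::VecDeque;

/// Simplified stand-in for LDK's `ChannelId` (assumption: 32 raw bytes).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ChannelId(pub [u8; 32]);

/// Toy model: the monitor write has already happened inline, and only the
/// lightweight completion signal is held back.
pub struct DeferredCompletions {
    /// Signals for writes that finished but were not yet reported.
    queue: VecDeque<(ChannelId, u64)>,
    /// Signals that have been handed to the manager (modeled as a list).
    delivered: Vec<(ChannelId, u64)>,
}

impl DeferredCompletions {
    pub fn new() -> Self {
        Self { queue: VecDeque::new(), delivered: Vec::new() }
    }

    /// Records that persistence for `update_id` finished; nothing is delivered yet.
    pub fn channel_monitor_updated(&mut self, channel_id: ChannelId, update_id: u64) {
        self.queue.push_back((channel_id, update_id));
    }

    /// Delivers every queued signal in FIFO order and returns how many were delivered.
    pub fn flush(&mut self) -> usize {
        let n = self.queue.len();
        self.delivered.extend(self.queue.drain(..));
        n
    }

    /// Signals delivered so far.
    pub fn delivered(&self) -> &[(ChannelId, u64)] {
        &self.delivered
    }
}
```

Until `flush()` runs, `delivered()` stays empty, which corresponds to the ChannelManager still seeing the updates as InProgress.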

Benchmarking A->B node payments with a 150 ms write latency shows this approach is ~25% faster than #4351. The reason is that #4351 defers the actual monitor persist to flush time, which means the monitor write happens sequentially after the ChannelManager write. In this PR, the monitor write happens inline during the ChannelManager operation, so by the time the background processor persists the ChannelManager, the monitor is already on disk. The monitor and manager writes effectively overlap rather than running back-to-back.

When the background processor loop is further parallelized (#4419), another ~20% speedup is gained. Altogether this makes the difference with non-deferred writing small.

Add a `deferred` parameter to `ChainMonitor::new` and `ChainMonitor::new_async_beta`. When set to true, the Watch trait methods (watch_channel and update_channel) will panic via unimplemented!() for now. All existing callers pass false to preserve current behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ldk-reviews-bot · 2026-02-16T08:56:12Z

👋 Hi! I see this is a draft PR.
I'll wait to assign reviewers until you mark it as ready for review.
Just convert it out of draft status when you're ready for review!

codecov · 2026-02-16T10:31:50Z

Codecov Report

❌ Patch coverage is 95.65217% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.12%. Comparing base (4e32d10) to head (8306add).
⚠️ Report is 48 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lightning/src/chain/chainmonitor.rs | 95.29% | 8 Missing and 4 partials ⚠️ |
| lightning/src/util/test_utils.rs | 92.68% | 3 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4422      +/-   ##
==========================================
+ Coverage   86.06%   86.12%   +0.06%     
==========================================
  Files         156      156              
  Lines      103188   103941     +753     
  Branches   103188   103941     +753     
==========================================
+ Hits        88808    89522     +714     
- Misses      11868    11897      +29     
- Partials     2512     2522      +10

| Flag | Coverage Δ | |
|---|---|---|
| tests | `86.12% <95.65%> (+0.06%)` | ⬆️ |


When ChainMonitor is constructed with deferred=true, monitor operations (persist, insert, apply update) still execute inline, but completion signals are held back in a queue. pending_operation_count() returns the queue length and flush(count) delivers up to that many completions via channel_monitor_updated().

The public channel_monitor_updated() method checks the deferred flag and queues completions rather than resolving them immediately. This ensures that both synchronous persistence completions and external async callers are properly deferred. flush() calls an internal non-deferring variant to actually deliver the signals.

The BackgroundProcessor snapshots the pending count before persisting the ChannelManager, then flushes that many completions afterward. This ensures the ChannelManager is always persisted before its associated monitor completions are signaled, avoiding force closures from a crash between monitor and channel manager persistence.

A test is added covering the interaction between deferred mode and async persistence (persister returning InProgress), verifying the two-phase completion flow: persister signals completion via channel_monitor_updated (queued into deferred_completions), then flush delivers them.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
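The snapshot-then-flush ordering can be sketched as follows. This is a simplified model rather than the BackgroundProcessor code: `DeferredMonitor`, `queue_completion`, `signaled`, and `persist_then_flush` are hypothetical names, and the manager write is represented by a closure so that a completion arriving during the write can be simulated.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a ChainMonitor in deferred mode; only the queue is modeled.
pub struct DeferredMonitor {
    /// Update ids whose monitor write finished but whose completion is held back.
    pending: VecDeque<u64>,
    /// Update ids whose completion has been delivered.
    signaled: Vec<u64>,
}

impl DeferredMonitor {
    pub fn new() -> Self {
        Self { pending: VecDeque::new(), signaled: Vec::new() }
    }

    /// Queues a completion instead of delivering it.
    pub fn queue_completion(&mut self, update_id: u64) {
        self.pending.push_back(update_id);
    }

    /// Number of completions currently held back.
    pub fn pending_operation_count(&self) -> usize {
        self.pending.len()
    }

    /// Delivers at most `count` completions, oldest first.
    pub fn flush(&mut self, count: usize) {
        let n = count.min(self.pending.len());
        self.signaled.extend(self.pending.drain(..n));
    }

    /// Completions delivered so far.
    pub fn signaled(&self) -> &[u64] {
        &self.signaled
    }
}

/// One loop iteration: snapshot the count, persist the manager, then flush
/// only the snapshotted number of completions.
pub fn persist_then_flush<F>(monitor: &mut DeferredMonitor, persist_manager: F)
where
    F: FnOnce(&mut DeferredMonitor),
{
    let count = monitor.pending_operation_count();
    // The closure models the manager write; completions may arrive while it runs.
    persist_manager(monitor);
    monitor.flush(count);
}
```

In this model, a completion queued while the manager write is in progress is not covered by the snapshot, so it stays pending until the next iteration, after a manager write that could have observed it.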

joostjager · 2026-02-16T13:42:00Z

Doesn't work because once a monitor is on disk, channel manager has to be there too, regardless of when the completion signal comes.

joostjager · 2026-02-16T15:41:05Z

This approach could possibly still work if after restart monitors are only read up to the point that chan mgr is aware of. And care is taken with cleanup of old updates.
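A rough sketch of that restart idea, under the assumption that monitor updates are stored as separate incremental records keyed by update id. `StoredUpdate`, `updates_to_replay`, and `manager_latest_update_id` are hypothetical names used for illustration and are not part of LDK; cleanup of old updates is not modeled.

```rust
/// Hypothetical persisted monitor update; only the id matters for this sketch.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct StoredUpdate {
    pub update_id: u64,
}

/// Selects the updates to replay on restart: only those the persisted
/// manager already knows about, returned in ascending id order.
pub fn updates_to_replay(
    stored: &[StoredUpdate],
    manager_latest_update_id: u64,
) -> Vec<StoredUpdate> {
    let mut selected: Vec<StoredUpdate> = stored
        .iter()
        .filter(|u| u.update_id <= manager_latest_update_id)
        .cloned()
        .collect();
    selected.sort_by_key(|u| u.update_id);
    selected
}
```

Updates with a higher id than the manager recorded would be ignored on load in this model, which is why removing old updates would need care.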

joostjager mentioned this pull request Feb 16, 2026

Defer ChainMonitor updates and persistence to flush() #4351

Open

joostjager force-pushed the chain-mon-deferred-completion branch from 159b30c to 8306add Compare February 16, 2026 12:43

joostjager closed this Feb 16, 2026

joostjager wants to merge 2 commits into lightningdevkit:main from joostjager:chain-mon-deferred-completion
