Skip to content

Fix TOCTOU race in gossip discovery handleAliveMessage#5397

Open
Ady0333 wants to merge 1 commit intohyperledger:mainfrom
Ady0333:fix-gossip-discovery-toctou-race
Open

Fix TOCTOU race in gossip discovery handleAliveMessage#5397
Ady0333 wants to merge 1 commit intohyperledger:mainfrom
Ady0333:fix-gossip-discovery-toctou-race

Conversation

@Ady0333
Copy link
Contributor

@Ady0333 Ady0333 commented Feb 12, 2026

Type of change

  • Bug fix

Description

This PR fixes a TOCTOU race in gossip discovery that could crash a peer during normal operation.

handleAliveMessage() previously read membership state using multiple separate RLock / RUnlock sections and then treated the combined result as consistent. If a concurrent purge() ran between these lock gaps (for example during identity changes or message expiration), a member could appear as “known” but neither alive nor dead, triggering a Panicf. In another path, the same race could lead to a nil pointer dereference in learnExistingMembers().

The fix consolidates all membership state reads into a single atomic RLock section and adds a defensive nil guard when updating existing members. This ensures inconsistent intermediate states are never observable and avoids crashing the peer.


Additional details

The change is localized to gossip discovery logic and does not affect normal behavior.
Tested locally with go test ./gossip/....


Release Note

Fixes a gossip discovery race where concurrent member purging could cause
a peer panic during normal operation.

Consolidate three separate RLock/RUnlock pairs into a single atomic
read to prevent concurrent purge() from removing a member between
checks, which caused a panic ("not found in alive nor dead") or nil
pointer dereference in learnExistingMembers.

Signed-off-by: Ady0333 <adityashinde1525@gmail.com>
@Ady0333 Ady0333 requested a review from a team as a code owner February 12, 2026 10:41
@Ady0333
Copy link
Contributor Author

Ady0333 commented Feb 12, 2026

Hi @yacovm, when you get a chance, could you please review this PR?
Thanks!

@C0rWin
Copy link
Contributor

C0rWin commented Feb 20, 2026

@Ady0333, can you please add a unit test that manifests the issue you are solving here? Exactly like you did with #5379?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants