consensus/clique: only trust snapshot for genesis or les checkpoint #17620
This is the fix for the Rinkeby consensus split.
When adding the light client checkpoint sync support for Rinkeby (Clique), we needed to relax the requirement that signing/voting snapshots are generated from previous blocks, and rather trust a standalone epoch block in itself, similar to how we trust the genesis (so light nodes can sync from there instead of verifying the entire header chain).
The oversight, however, was that the genesis block has no previous signers (signers who recently sealed and so can't sign right now), whereas checkpoint blocks do. The checkpoint sync extension caused Clique nodes to discard previous signers at epoch blocks, allowing any authorized signer to seal the next block.
This caused signers running on v1.8.14 and v1.8.15 to create an invalid block, sealed by a node that already sealed recently and shouldn't have been allowed to do so, causing a consensus split between new nodes and old nodes.
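To illustrate why discarding the recent signer list matters, here is a minimal sketch of a "recently sealed" check. It is not the actual go-ethereum code: the `snapshot` type, the `maySeal` method and the string signer names are hypothetical, and the `len(signers)/2 + 1` window is an assumption based on how Clique is usually described.

```go
package main

// snapshot is a simplified, hypothetical view of the signer state at a block.
type snapshot struct {
	Signers []string          // currently authorized signers
	Recents map[uint64]string // block number -> signer that sealed it
}

// maySeal reports whether signer is allowed to seal block `number`.
// Assumption: a signer may seal at most one block in any window of
// len(Signers)/2 + 1 consecutive blocks.
func (s *snapshot) maySeal(signer string, number uint64) bool {
	authorized := false
	for _, a := range s.Signers {
		if a == signer {
			authorized = true
			break
		}
	}
	if !authorized {
		return false
	}
	limit := uint64(len(s.Signers)/2 + 1)
	for seen, recent := range s.Recents {
		// The signer sealed block `seen`, which is still inside the window.
		if recent == signer && seen+limit > number {
			return false
		}
	}
	return true
}
```

With three signers and `Recents` recording that "A" sealed block 30000, `maySeal("A", 30001)` returns false. If the snapshot at the epoch block is rebuilt with an empty `Recents` map, the same call returns true, which is the situation that let a recent signer seal again.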
This PR fixes the issue by making the checkpoint snapshot trust more strict, only ever trusting a snapshot block blindly if it's the genesis or if its parent is missing (i.e. we're starting sync from the middle of the chain, not the genesis). For all other scenarios, we still regenerate the snapshot ourselves along with the recent signer list.
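The stricter trust rule can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the `header` and `headerIndex` types, the `trustSnapshotBlindly` function and the `epochLength` constant are hypothetical stand-ins for the real chain reader and Clique configuration.

```go
package main

// epochLength is a hypothetical stand-in for the configured Clique epoch.
const epochLength = 30000

// header is a simplified, hypothetical block header.
type header struct {
	Number     uint64
	ParentHash string
}

// headerIndex is a hypothetical lookup of locally known headers by hash.
type headerIndex map[string]*header

// trustSnapshotBlindly reports whether the signer snapshot may be taken
// from the block itself instead of being regenerated from earlier blocks.
func trustSnapshotBlindly(h *header, known headerIndex) bool {
	if h.Number == 0 {
		return true // genesis: there are no previous signers to track
	}
	if h.Number%epochLength != 0 {
		return false // only epoch (checkpoint) blocks carry the signer list
	}
	// Trust a checkpoint only when its parent is missing, i.e. sync is
	// starting from the middle of the chain rather than from genesis.
	_, haveParent := known[h.ParentHash]
	return !haveParent
}
```

Whenever this returns false, the snapshot would be regenerated from earlier headers, which also rebuilds the recent signer list.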
Note that this hotfix still leaves light clients susceptible to the same bug following a LES checkpoint, whereby they accept blocks signed by the wrong signers for a couple of blocks. That's fine, because as long as full nodes correctly enforce the good chain, light clients can only ever import a couple of bad blocks before they get stuck or switch to the properly validated chain. `len(signers) / 2` blocks after initial startup, light clients become immune to this "vulnerability" as well.