Synchronisation failed, dropping peer; err="retrieved hash chain is invalid"; message loop #17849

Closed

I have a private Clique proof-of-authority chain.

Geth minimum version: v1.8.16-stable

I have updated Geth on all signer nodes (currently three) to at least v1.8.16-stable. I have also updated the other node, the one that reports the error, to v1.8.16.

This is how I run geth (I also tried without the --syncmode fast flag):

geth --syncmode fast --cache=1024 --shh --datadir $DATADIR/private --rpcaddr 127.0.0.1 --rpc --rpcport 8545 --rpccorsdomain="*" --networkid 12345 --rpcapi admin,eth,net,web3,debug,personal,shh

This is the error I am seeing on multiple nodes connected to the network:

########## BAD BLOCK #########
Chain config: {ChainID: 23422 Homestead: 1 DAO: <nil> DAOSupport: false EIP150: 2 EIP155: 3 EIP158: 3 Byzantium: 4 Constantinople: <nil> Engine: clique}
    
Number: 1260001
Hash: 0x659e96f35e1fa1c39fc3b8370a336f78787e482aef44e56bbe6dd9e10bb06bdc
    
    
Error: recently signed
##############################

WARN [10-05|15:49:57.694] Synchronisation failed, dropping peer    peer=ae57fcb24c19102e err="retrieved hash chain is invalid"
INFO [10-05|15:49:57.694] message loop                             peer=ae57fcb24c19102e err=EOF
ERROR[10-05|15:50:07.707]
########## BAD BLOCK #########
Chain config: {ChainID: 23422 Homestead: 1 DAO: <nil> DAOSupport: false EIP150: 2 EIP155: 3 EIP158: 3 Byzantium: 4 Constantinople: <nil> Engine: clique}

Number: 1260001
Hash: 0x659e96f35e1fa1c39fc3b8370a336f78787e482aef44e56bbe6dd9e10bb06bdc

I rewound the chain to an earlier block with debug.setHead("0x124F80") (block 1200000), but it did not help.
I also removed my chaindata with geth removedb and synced from the start, which didn't help either.
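
For reference, the argument passed to debug.setHead above is the block number written as a hex string. A minimal Python sketch of that conversion follows; the helper name is made up for illustration and is not part of Geth.

```python
def set_head_arg(block_number: int) -> str:
    """Return a block number as a hex string, the form used with debug.setHead above."""
    if block_number < 0:
        raise ValueError("block number must be non-negative")
    return hex(block_number)


print(set_head_arg(1200000))   # 0x124f80
print(int("0x124F80", 16))     # 1200000
```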

holiman

Contributor

This is a known issue introduced in 1.8.14, iirc. Fixed in #17620

holiman

closed this as completed

reopened this

Contributor

You originally reported this on Version: 1.8.15-unstable. So are there still nodes running Version: 1.8.15-unstable? And are those still present? If so, they may be still advertising a chain which is invalid, and is being rejected by your node.

avatar-lavventura

Author

@holiman: I have updated Geth on all signer nodes (currently three) to at least v1.8.16-stable. I have also updated the node that reports the error to v1.8.16.

I have re-synced the chain from the genesis block, but unfortunately I am still getting the same error.

Possible Solution:

Should I use debug.setHead() to roll the signer nodes' chain data back to a point before the error occurred? That might only be a short-term fix, since the same error could occur again.

karalabe

Member

~~This was fixed on master. I.e. stable release will arrive today.~~

karalabe

Member

Oh wait, I thought this was the invalid hash chain error. Then this is not what I thought, sorry for the noise.

karalabe

Member

@avatar-lavventura Yeah, you might need to rewind the chain back to before the faulty snapshot block / epoch transition. The signers won't realize some past block became invalid.
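
To find the epoch transition nearest a bad block, the arithmetic can be sketched as below. This assumes a Clique epoch length of 30000 blocks, which is a common default but is set per chain in the genesis file, so the value for a given network should be checked there. The function name is hypothetical.

```python
def last_epoch_checkpoint(block_number: int, epoch: int = 30000) -> int:
    """Highest multiple of `epoch` that is less than or equal to block_number."""
    if epoch <= 0:
        raise ValueError("epoch must be positive")
    return (block_number // epoch) * epoch


bad_block = 1260001                       # number from the BAD BLOCK report in this thread
checkpoint = last_epoch_checkpoint(bad_block)
print(checkpoint)                         # 1260000
print(hex(checkpoint - 1))                # hex form of the block just before the checkpoint
```

Under that assumption, the bad block 1260001 directly follows the checkpoint at 1260000, and block 1200000 (40 × 30000) is itself an epoch boundary two epochs earlier.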

avatar-lavventura

Author

@karalabe: I rewound all the signers and the problem is fixed.

Could this error occur again after the rewind? I don't understand the main reason behind it.

karalabe

Member

There was a bug in one Geth release (v1.8.14/v1.8.15) that violated the Clique consensus spec, causing some signers to create blocks when they weren't allowed to (at an epoch transition). All previous and subsequent versions of Geth (apart from the faulty one) correctly rejected those blocks, which is why you couldn't sync a new node to your already mined chain.

A node however does not re-validate blocks when you update it, so even though you updated your signers, they were oblivious to the fact that a faulty block was already in their chain. When you rewound the chain, the signers had to re-mine the faulty segment, correcting the issue.

This should most definitely not happen again, as long as you don't use the faulty version of Geth. Any version equal to or above v1.8.16 should work just fine.
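
The "recently signed" error in the bad-block report relates to Clique's rule that a signer may seal at most one block out of every floor(N/2)+1 consecutive blocks, where N is the number of authorised signers. Below is a simplified sketch of that check. It models the rule as described in the Clique specification (EIP-225) rather than reproducing Geth's code, and the function name, the `recents` mapping, and the addresses "A" and "B" are illustrative placeholders.

```python
def recently_signed(recents: dict, signer: str, number: int, signer_count: int) -> bool:
    """recents maps block number -> address that sealed that block.

    Returns True if `signer` sealed a block inside the window of
    floor(signer_count / 2) + 1 blocks ending at `number`, in which case
    it is not allowed to seal block `number`.
    """
    limit = signer_count // 2 + 1
    return any(
        who == signer and seen > number - limit
        for seen, who in recents.items()
    )


# Three signers, as in this thread: the limit is 2, so nobody may seal twice in a row.
recents = {1259999: "A", 1260000: "B"}   # placeholder addresses
print(recently_signed(recents, "B", 1260001, 3))  # True: B sealed the previous block
print(recently_signed(recents, "A", 1260001, 3))  # False: A's last block is outside the window
```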

karalabe

closed this as completed

on Oct 9, 2018

kennethhoytwoodruff

I am running Geth/v1.8.26-stable-cdae1c59/linux-arm64/go1.11.8.

I have made several attempts over the past few days to sync the ropsten network. I always get pretty far along and things are going smoothly...

eth.syncing
{
currentBlock: 5414169,
highestBlock: 5418913,
knownStates: 107811598,
pulledStates: 107811598,
startingBlock: 5414168
}
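
As a rough way to read this output, the gap between the local head and the highest block seen from peers can be computed directly. The sketch below copies the values from the output above; the helper name is hypothetical, and the interpretation of the state counters (discovered versus downloaded state entries) is an assumption about the usual meaning of these fields.

```python
def blocks_remaining(status: dict) -> int:
    """Blocks between the local head and the highest block reported by peers."""
    return status["highestBlock"] - status["currentBlock"]


status = {
    "currentBlock": 5414169,
    "highestBlock": 5418913,
    "knownStates": 107811598,
    "pulledStates": 107811598,
    "startingBlock": 5414168,
}
print(blocks_remaining(status))                          # 4744
print(status["knownStates"] - status["pulledStates"])    # 0
```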

But THEN... I get the dreaded "BAD BLOCK", followed by this message: Synchronisation failed, dropping peer peer=35a1db7f4a92d6d6 err="retrieved hash chain is invalid". Then everything comes to a halt. Nothing precipitates this failure. It just happens.

Then I make another attempt, which always fails after days of syncing.

I am getting really frustrated and am thinking about using Parity instead. Any suggestions?

folex

I'm facing a similar issue on the Rinkeby network. A lot of BAD BLOCK errors (see below) started to appear on May 04 at 05:33 UTC:

WARN [05-04|08:41:57.197] Synchronisation failed, dropping peer    peer=c71d079f0ce96e1d err="retrieved hash chain is invalid"
ERROR[05-04|08:56:00.458]
########## BAD BLOCK #########
Chain config: {ChainID: 4 Homestead: 1 DAO: <nil> DAOSupport: true EIP150: 2 EIP155: 3 EIP158: 3 Byzantium: 1035301 Constantinople: 3660663  ConstantinopleFix: 9999999 Engine: clique}

Number: 4321234
Hash: 0xe2fa06d53b28bfa053e022686d6106026f8a1d5fe40e0eccd09e3f7165acd424
         0: cumulative: 30107 gas: 30107 contract: 0x0000000000000000000000000000000000000000 status: 1 tx: 0x18abad37269a35c25b125039e82f62ed95b00fd0644c7c590530297d1bef8a27 logs: [0xc06938c2c0] bloom: 00000000000000000000000000000000000010000000000000000000200000000000000000000000000000000000000000000002000000000000000000000000000000000000000000000000000000002000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000200000000000000000000000000000000000000000000008000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000001000010000000 state:
         1: cumulative: 69155 gas: 39048 contract: 0x0000000000000000000000000000000000000000 status: 1 tx: 0x6ce0d490ddc5462db6ff1a33c82360455bac423587a1c8f3d3b57910ec32e716 logs: [0xc06938c370] bloom: 00000000080000000200000000000000000000000000000000040000000000000000000010000000000000000000000000000400200000000000000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000004000000000100000000020000002000000000000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002000000000000000000000000000002400000000000000000000000000000000000000000000000000000 state:
         2: cumulative: 436538 gas: 367383 contract: 0x0000000000000000000000000000000000000000 status: 1 tx: 0xe53644bda6ac3f7880d08a56438b3d8a2f7be46b3d91bd48a95e207b803ca3b2 logs: [] bloom: 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 state:
         3: cumulative: 459950 gas: 23412 contract: 0x0000000000000000000000000000000000000000 status: 0 tx: 0xa5bb5d5451600310962f95cccd42963551c950d87048ea94563f43b07c7f79cb logs: [] bloom: 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 state:
         4: cumulative: 488879 gas: 28929 contract: 0x0000000000000000000000000000000000000000 status: 1 tx: 0xbebcde90a9f479d131d9e882349fa49f4495b4c54608fac0ecbe9b2c52727729 logs: [] bloom: <many zeros> state:


Error: invalid gas used (remote: 517679 local: 488879)
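
The numbers in this error can be checked by hand. In this message, "local" is the gas this node computed while executing the block and "remote" is the gas-used value carried by the received block. A small sketch using the per-transaction gas figures from the receipts listed above:

```python
# Per-transaction gas taken from the five receipts in the BAD BLOCK report above.
gas_per_tx = [30107, 39048, 367383, 23412, 28929]
remote_gas_used = 517679    # "remote" value from the error line

local_gas_used = sum(gas_per_tx)
print(local_gas_used)                      # 488879, matches "local"
print(remote_gas_used - local_gas_used)    # 28800
```

So the local node and the block producer disagree by 28800 gas on the same five transactions, which points at differing execution rules rather than at a missing transaction.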

This all happened on Geth v1.8.23, running in a docker container ethereum/client-go:stable. I then pulled a new container with Geth/v1.8.27-stable-4bcc0a37/linux-amd64/go1.11.9 inside.

I restarted the container, and it started to sync:

INFO [05-04|08:59:31.162] Initialised chain configuration          config="{ChainID: 4 Homestead: 1 DAO: <nil> DAOSupport: true EIP150: 2 EIP155: 3 EIP158: 3 Byzantium: 1035301 Constantinople: 3660663  ConstantinopleFix: 4321234 Engine: clique}"
INFO [05-04|08:59:31.164] Initialising Ethereum protocol           versions="[63 62]" network=4
WARN [05-04|08:59:31.236] Head state missing, repairing chain      number=4321233 hash=3fb324…efc3d0
INFO [05-04|09:00:24.833] Rewound blockchain to past state         number=4207869 hash=f51389…9ec47a
INFO [05-04|09:00:24.833] Loaded most recent local header          number=4321233 hash=3fb324…efc3d0 td=7948872 age=3h27m54s
INFO [05-04|09:00:24.833] Loaded most recent local full block      number=4207869 hash=f51389…9ec47a td=7747159 age=2w5d19h
INFO [05-04|09:00:24.833] Loaded most recent local fast block      number=4321233 hash=3fb324…efc3d0 td=7948872 age=3h27m54s
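
The chain config printed by v1.8.23 in the BAD BLOCK report and the one printed by v1.8.27 at startup can be compared field by field. The sketch below does that with a simple regex; the parsing approach and function name are assumptions made for illustration, and the config strings are copied from the logs in this comment.

```python
import re


def parse_chain_config(text: str) -> dict:
    """Extract 'Name: value' pairs from a chain config string as printed in the logs."""
    inner = text[text.index("{") + 1 : text.rindex("}")]
    return dict(re.findall(r"(\w+):\s*(\S+)", inner))


old = ("{ChainID: 4 Homestead: 1 DAO: <nil> DAOSupport: true EIP150: 2 EIP155: 3 EIP158: 3 "
       "Byzantium: 1035301 Constantinople: 3660663  ConstantinopleFix: 9999999 Engine: clique}")
new = ("{ChainID: 4 Homestead: 1 DAO: <nil> DAOSupport: true EIP150: 2 EIP155: 3 EIP158: 3 "
       "Byzantium: 1035301 Constantinople: 3660663  ConstantinopleFix: 4321234 Engine: clique}")

old_cfg, new_cfg = parse_chain_config(old), parse_chain_config(new)
changed = {k: (old_cfg[k], new_cfg.get(k)) for k in old_cfg if old_cfg[k] != new_cfg.get(k)}
print(changed)   # {'ConstantinopleFix': ('9999999', '4321234')}
```

The only field that differs is ConstantinopleFix, and its new value equals the number of the bad block (4321234). That suggests the older binary did not know about this fork block; this reading is an inference from the logs, not something confirmed in the thread.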

Syncing was pretty slow (compared to the usual sync speed), and the following errors started to appear in the logs (pasting the first appearance):

INFO [05-04|09:07:00.646] Imported new chain segment               blocks=350  txs=3020 mgas=538.680 elapsed=8.008s    mgasps=67.263 number=4217719 hash=2e32a4…39f1fb age=2w4d2h   cache=164.80mB
WARN [05-04|09:07:06.419] Synchronisation failed, dropping peer    peer=2573096ae36ad36b err="retrieved hash chain is invalid"
INFO [05-04|09:07:29.265] Importing heavy sidechain segment        blocks=2048 start=4217903 end=4219950
ERROR[05-04|09:07:30.005] Impossible reorg, please file an issue   oldnum=4217901 oldhash=d37168…60fa2f newnum=4217901 newhash=d37168…60fa2f
INFO [05-04|09:07:37.283] Imported new chain segment               blocks=234  txs=2360 mgas=437.251 elapsed=8.017s    mgasps=54.534 number=4218136 hash=8a6ccf…326568 age=2w4d1h   cache=167.58mB
INFO [05-04|09:07:45.306] Imported new chain segment               blocks=311  txs=3751 mgas=654.378 elapsed=8.022s    mgasps=81.568 number=4218447 hash=a325de…b3c61d age=2w3d23h  cache=170.52mB
WARN [05-04|09:07:45.488] Synchronisation failed, dropping peer    peer=2573096ae36ad36b err="retrieved hash chain is invalid"
INFO [05-04|09:08:06.623] Importing heavy sidechain segment        blocks=2048 start=4218456 end=4220503
ERROR[05-04|09:08:07.846] Impossible reorg, please file an issue   oldnum=4218454 oldhash=92baf1…c79693 newnum=4218454 newhash=92baf1…c79693

By now, the log is full of such errors:

ERROR[05-04|11:45:05.849] Impossible reorg, please file an issue   oldnum=4320754 oldhash=040e63…fe7b8a newnum=4320754 newhash=040e63…fe7b8a
WARN [05-04|11:45:06.093] Synchronisation failed, dropping peer    peer=1aabba770181eef1 err="retrieved hash chain is invalid"
INFO [05-04|11:46:26.532] Importing sidechain segment              start=4320768 end=4322616
ERROR[05-04|11:46:26.558] Impossible reorg, please file an issue   oldnum=4320766 oldhash=3619e9…10ef3d newnum=4320766 newhash=3619e9…10ef3d
WARN [05-04|11:46:27.053] Synchronisation failed, dropping peer    peer=94d2999235ecaef2 err="retrieved hash chain is invalid"
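
In each "Impossible reorg" line above, the old and new block are reported with the same number and the same (abbreviated) hash. The sketch below pulls key=value fields out of such lines so this can be checked across a whole log; the regex is an assumption based on the layout of the lines shown here, not a documented log format.

```python
import re


def parse_fields(line: str) -> dict:
    """Collect key=value pairs from a log line; quoted values keep their spaces."""
    pairs = re.findall(r'(\w+)=("[^"]*"|\S+)', line)
    return {key: value.strip('"') for key, value in pairs}


line = ("ERROR[05-04|11:45:05.849] Impossible reorg, please file an issue   "
        "oldnum=4320754 oldhash=040e63…fe7b8a newnum=4320754 newhash=040e63…fe7b8a")
fields = parse_fields(line)
print(fields["oldnum"] == fields["newnum"])      # True
print(fields["oldhash"] == fields["newhash"])    # True
```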

It's still moving forward rather than stalling, but very slowly (about 1 block per minute or slower).

Network: Rinkeby
Version: Geth/v1.8.27-stable-4bcc0a37/linux-amd64/go1.11.9
Run args: --rinkeby --rpc --rpccorsdomain "*" --rpcaddr '<HIDDEN IP>' --rpcport 8545 --ws --wsaddr '<HIDDEN IP>' --wsport <HIDDEN PORT> --verbosity 3 --datadir /root/.ethereum --v5disc --rpcvhosts=* --nat extip:<HIDDEN IP> --lightserv 90 --lightpeers 100

UPDATE: Now it has stalled on block 0x41f0d8 for an hour so far
UPDATE 2: I've launched another rinkeby full node, and it synced just fine

ProDog

@folex It works for me after upgrading go-ethereum to the latest version.

Activity

holiman commented on Oct 5, 2018

holiman commented on Oct 5, 2018

avatar-lavventura commented on Oct 5, 2018

karalabe commented on Oct 8, 2018

karalabe commented on Oct 8, 2018

karalabe commented on Oct 8, 2018

avatar-lavventura commented on Oct 8, 2018

karalabe commented on Oct 9, 2018

kennethhoytwoodruff commented on Apr 16, 2019

folex commented on May 4, 2019

ProDog commented on May 11, 2019
