NACOS 1.0.0 code:503 msg: server is DOWN now, please try again later! #1189

Closed
haochencheng opened this issue May 8, 2019 · 20 comments

@haochencheng

2019-05-08 11:17:09.746  INFO 50906 --- [ing.beat.sender] com.alibaba.nacos.client.naming          : [BEAT] 68da671b-ef91-47e0-b5d0-d3a458bd743a sending beat to server: {"cluster":"DEFAULT","ip":"192.168.69.244","metadata":{"preserved.register.source":"SPRING_CLOUD"},"port":40100,"scheduled":true,"serviceName":"DEFAULT_GROUP@@microservice-integration-gateway","weight":1.0}
2019-05-08 11:17:09.748 DEBUG 50906 --- [ing.beat.sender] com.alibaba.nacos.client.naming          : Request from server: http://127.0.0.1:8848/nacos/v1/ns/instance/beat?beat=%7B%22cluster%22%3A%22DEFAULT%22%2C%22ip%22%3A%22192.168.69.244%22%2C%22metadata%22%3A%7B%22preserved.register.source%22%3A%22SPRING_CLOUD%22%7D%2C%22port%22%3A40100%2C%22scheduled%22%3Atrue%2C%22serviceName%22%3A%22DEFAULT_GROUP%40%40microservice-integration-gateway%22%2C%22weight%22%3A1.0%7D&serviceName=DEFAULT_GROUP%40%40microservice-integration-gateway&encoding=UTF-8&namespaceId=68da671b-ef91-47e0-b5d0-d3a458bd743a
2019-05-08 11:17:09.751 ERROR 50906 --- [ing.beat.sender] com.alibaba.nacos.client.naming          : request 127.0.0.1:8848 failed.

com.alibaba.nacos.api.exception.NacosException: failed to req API:http://127.0.0.1:8848/nacos/v1/ns/instance/beat. code:503 msg: server is DOWN now, please try again later!
	at com.alibaba.nacos.client.naming.net.NamingProxy.callServer(NamingProxy.java:340)
	at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:367)
	at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:304)
	at com.alibaba.nacos.client.naming.net.NamingProxy.sendBeat(NamingProxy.java:227)
	at com.alibaba.nacos.client.naming.beat.BeatReactor$BeatTask.run(BeatReactor.java:109)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
	at java.util.concurrent.FutureTask.run(FutureTask.java)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

2019-05-08 11:17:09.751 ERROR 50906 --- [ing.beat.sender] com.alibaba.nacos.client.naming          : [CLIENT-BEAT] failed to send beat: {"cluster":"DEFAULT","ip":"192.168.69.244","metadata":{"preserved.register.source":"SPRING_CLOUD"},"port":40100,"scheduled":true,"serviceName":"DEFAULT_GROUP@@microservice-integration-gateway","weight":1.0}
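
The DEBUG line above shows that the client sends the heartbeat as URL-encoded JSON in the `beat` query parameter. A minimal sketch for decoding it when reading such logs, using only the Python standard library; the function name `decode_beat` is hypothetical and not part of the Nacos client:

```python
import json
from urllib.parse import parse_qs, urlsplit

# The request URL copied from the DEBUG log line, split for readability.
LOGGED_URL = (
    "http://127.0.0.1:8848/nacos/v1/ns/instance/beat"
    "?beat=%7B%22cluster%22%3A%22DEFAULT%22%2C%22ip%22%3A%22192.168.69.244%22"
    "%2C%22metadata%22%3A%7B%22preserved.register.source%22%3A%22SPRING_CLOUD%22%7D"
    "%2C%22port%22%3A40100%2C%22scheduled%22%3Atrue"
    "%2C%22serviceName%22%3A%22DEFAULT_GROUP%40%40microservice-integration-gateway%22"
    "%2C%22weight%22%3A1.0%7D"
    "&serviceName=DEFAULT_GROUP%40%40microservice-integration-gateway"
    "&encoding=UTF-8&namespaceId=68da671b-ef91-47e0-b5d0-d3a458bd743a"
)


def decode_beat(url):
    """Return the beat payload of a logged heartbeat URL as a dict."""
    query = urlsplit(url).query
    # parse_qs percent-decodes the values; each key maps to a list.
    beat_json = parse_qs(query)["beat"][0]
    return json.loads(beat_json)


beat = decode_beat(LOGGED_URL)
print(beat["ip"], beat["port"], beat["serviceName"])
```

The decoded payload matches the JSON printed in the `[BEAT]` INFO line, which confirms the request itself is well formed and the 503 comes from the server side.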

@leavegee

leavegee commented May 8, 2019

Same problem here.

@leavegee

leavegee commented May 8, 2019

Solved. cluster.conf must not list localhost or 127.0.0.1; each entry must be a specific IP address or domain name.
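
A minimal sketch of a check for the rule in the comment above. It assumes cluster.conf holds one `host` or `host:port` entry per line with `#` comment lines (the layout shown elsewhere in this thread) and IPv4 addresses or hostnames only. The function name `find_loopback_entries` is hypothetical and not part of Nacos.

```python
import ipaddress


def find_loopback_entries(conf_text):
    """Return the cluster.conf entries that point at the loopback interface."""
    bad = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Assumption: IPv4 or hostname, so the first ':' starts the port.
        host = line.partition(":")[0]
        if host.lower() == "localhost":
            bad.append(line)
            continue
        try:
            if ipaddress.ip_address(host).is_loopback:
                bad.append(line)
        except ValueError:
            pass  # not an IP literal, so treat it as a domain name
    return bad


sample = "#it is ip\n127.0.0.1:8848\n192.168.1.220:8848\nlocalhost\n"
print(find_loopback_entries(sample))
```

Running this against each node's cluster.conf before startup flags the entries that would need to be replaced with a routable address.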

@nkorange
Collaborator

nkorange commented May 8, 2019

@haochencheng Check the error logs on your server; they are in the {nacos.home}/logs directory.

@imkiven

imkiven commented May 9, 2019

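I have the same problem. My cluster.conf contains neither localhost nor 127.0.0.1. There are three nodes, and one of them keeps reporting this error, although the console still opens.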

@imkiven

imkiven commented May 9, 2019

@haochencheng Have you solved this problem?

@haochencheng
Author

@haochencheng Have you solved this problem?

No.

@haochencheng
Author

When I list localhost or 127.0.0.1 in cluster.conf, the server throws an IllegalArgumentException like this:

java.lang.IllegalArgumentException: server: 192.168.199.200:8848 is not in serverlist
	at com.alibaba.nacos.naming.cluster.ServerListManager.onReceiveServerStatus(ServerListManager.java:196)
	at com.alibaba.nacos.naming.cluster.ServerListManager$ServerStatusReporter.run(ServerListManager.java:415)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

I then wrote my network interface IP in cluster.conf like this:

#it is ip
#example
192.168.199.200

and restarted Nacos. When I register a new service, it returns:

server is DOWN now, please try again later!%

Here is my naming-server.log:

2019-05-09 15:35:59,465 INFO listen for service meta change

2019-05-09 15:36:02,362 INFO [SERVER-INIT] got port: 8848

2019-05-09 15:36:02,362 INFO [SERVER-INIT] got path: /nacos

2019-05-09 15:36:04,263 INFO receive config info: unknown#192.168.199.200:8848#1557387364262#2


2019-05-09 15:36:04,264 INFO [NACOS-DISTRO] healthy server list changed, disable health check for 60000 ms from now on, old: [], new: [{"adWeight":0,"alive":true,"ip":"192.168.199.200","key":"192.168.199.200:8848","lastRefTime":1557387364262,"lastRefTimeStr":"2019-05-09 15:36:04","servePort":8848,"site":"unknown","weight":2}]

2019-05-09 15:36:19,269 INFO receive config info: unknown#192.168.199.200:8848#1557387379269#2


2019-05-09 15:36:31,031 INFO [HEALTH-CHECK] health check is false

2019-05-09 15:36:34,274 INFO receive config info: unknown#192.168.199.200:8848#1557387394273#2


2019-05-09 15:36:49,278 INFO receive config info: unknown#192.168.199.200:8848#1557387409278#2

@haochencheng
Author

@nkorange

@nkorange
Collaborator

This is a server list error. What's the content in cluster.conf?

@imkiven

imkiven commented May 10, 2019

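I don't get the error myself, but in the chat group I have seen many people report a problem with one particular node in their cluster.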

@nkorange
Collaborator

@haochencheng If you run Nacos in cluster mode, you have to configure at least 3 nodes in cluster.conf.
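
A small sketch of a pre-start check for the node count mentioned in the comment above. It assumes the cluster.conf layout shown in this thread (one entry per line, `#` for comments); the helper names `cluster_entries` and `has_enough_nodes` are hypothetical, and the minimum of 3 is taken from that comment rather than from the Nacos source.

```python
def cluster_entries(conf_text):
    """Return the non-comment, non-blank entries of a cluster.conf file."""
    entries = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            entries.append(line)
    return entries


def has_enough_nodes(conf_text, minimum=3):
    """True when cluster.conf lists at least `minimum` server entries."""
    return len(cluster_entries(conf_text)) >= minimum


# The single-entry file posted earlier in this thread.
single_node = "#it is ip\n#example\n192.168.199.200\n"
print(has_enough_nodes(single_node))
```

For the single-entry file quoted earlier in the thread the check fails, which is consistent with the advice to list at least three nodes when running in cluster mode.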

@ljh205sy

Same problem here. How do I fix it?

@dolcevitaforever

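I have the same problem. Every entry in the cluster configuration is a real IP, but when registering a service, one of the machines keeps returning 503 and the service never gets registered. @nkorange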
request 192.168.1.221:8848 failed.
com.alibaba.nacos.api.exception.NacosException: failed to req API:http://192.168.1.221:8848/nacos/v1/ns/instance/beat. code:503 msg: server is DOWN now, please try again later!
at com.alibaba.nacos.client.naming.net.NamingProxy.callServer(NamingProxy.java:340)
at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:367)
at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:304)
at com.alibaba.nacos.client.naming.net.NamingProxy.sendBeat(NamingProxy.java:227)
at com.alibaba.nacos.client.naming.beat.BeatReactor$BeatTask.run(BeatReactor.java:109)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2019-05-20 12:19:48.084 [com.alibaba.nacos.naming.beat.sender] ERROR com.alibaba.nacos.client.naming -
request 192.168.1.221:8848 failed.
com.alibaba.nacos.api.exception.NacosException: failed to req API:http://192.168.1.221:8848/nacos/v1/ns/instance/beat. code:503 msg: server is DOWN now, please try again later!
at com.alibaba.nacos.client.naming.net.NamingProxy.callServer(NamingProxy.java:340)
at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:367)
at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:304)
at com.alibaba.nacos.client.naming.net.NamingProxy.sendBeat(NamingProxy.java:227)
at com.alibaba.nacos.client.naming.beat.BeatReactor$BeatTask.run(BeatReactor.java:109)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Also, naming-ephemeral.log on the 221 machine keeps printing this message:
2019-05-20 12:24:41,829 INFO waiting server list init...

2019-05-20 12:24:42,829 INFO waiting server list init...

2019-05-20 12:24:43,829 INFO waiting server list init...

2019-05-20 12:24:44,829 INFO waiting server list init...

2019-05-20 12:24:45,175 DEBUG sync checksums: {com.alibaba.nacos.naming.iplist.ephemeral.a74cd717-d6b8-4a76-a7ca-0c601793c6d6##DEFAULT_GROUP@@service-feign=621f8b50571ba2f64a22d8a4c728fcc}

2019-05-20 12:24:45,829 INFO waiting server list init...

2019-05-20 12:24:46,829 INFO waiting server list init...

2019-05-20 12:24:47,830 INFO waiting server list init...

2019-05-20 12:24:48,830 INFO waiting server list init...

2019-05-20 12:24:49,830 INFO waiting server list init...

2019-05-20 12:24:50,176 DEBUG sync checksums: {com.alibaba.nacos.naming.iplist.ephemeral.a74cd717-d6b8-4a76-a7ca-0c601793c6d6##DEFAULT_GROUP@@service-feign=621f8b50571ba2f64a22d8a4c728fcc}

2019-05-20 12:24:50,830 INFO waiting server list init...

2019-05-20 12:24:51,830 INFO waiting server list init...

2019-05-20 12:24:52,830 INFO waiting server list init...

2019-05-20 12:24:53,831 INFO waiting server list init...

2019-05-20 12:24:54,831 INFO waiting server list init...

2019-05-20 12:24:55,176 DEBUG sync checksums: {com.alibaba.nacos.naming.iplist.ephemeral.a74cd717-d6b8-4a76-a7ca-0c601793c6d6##DEFAULT_GROUP@@service-feign=621f8b50571ba2f64a22d8a4c728fcc}

2019-05-20 12:24:55,831 INFO waiting server list init...

@nkorange
Collaborator

nkorange commented May 20, 2019

@dolcevitaforever Is the content of cluster.conf identical on all three machines? Does naming-raft.log show any errors? Does restarting 221 fix it?

@dolcevitaforever

Hello @nkorange, after restarting 221 the error is still the same.
cluster.conf is identical on all three machines:
#it is ip
192.168.1.99:8848
192.168.1.220:8848
192.168.1.221:8848

After restarting 221, its naming-raft.log shows this:
2019-05-20 12:30:49,850 INFO add listener: com.alibaba.nacos.naming.domains.meta.

2019-05-20 12:30:50,721 INFO add listener: com.alibaba.nacos.naming.domains.meta.00-00---000-NACOS_SWITCH_DOMAIN-000---00-00

2019-05-20 12:30:54,617 INFO 192.168.1.220:8848 has become the LEADER, local: {"heartbeatDueMs":5000,"ip":"192.168.1.221:8848","leaderDueMs":16491,"state":"FOLLOWER","term":0,"voteFor":""}, leader: {"heartbeatDueMs":5000,"ip":"192.168.1.220:8848","leaderDueMs":19496,"state":"LEADER","term":38,"voteFor":"192.168.1.220:8848"}

In naming-raft.log on 99:
2019-05-20 10:57:32,434 INFO 192.168.1.221:8848 has become the LEADER, local: {"heartbeatDueMs":5000,"ip":"192.168.1.99:8848","leaderDueMs":16972,"state":"FOLLOWER","term":0,"voteFor":""}, leader: {"heartbeatDueMs":5000,"ip":"192.168.1.221:8848","leaderDueMs":15942,"state":"LEADER","term":37,"voteFor":"192.168.1.221:8848"}

2019-05-20 10:57:32,434 INFO [RAFT] received beat with 0 keys, RaftCore.datums' size is 0, remote server: 192.168.1.221:8848, term: 37, local term: 0

2019-05-20 10:57:32,534 INFO raft peers changed: [{"adWeight":0,"alive":false,"ip":"192.168.1.220","key":"192.168.1.220:8848","lastRefTime":0,"lastRefTimeStr":"","servePort":8848,"site":"unknown","weight":1}, {"adWeight":0,"alive":false,"ip":"192.168.1.221","key":"192.168.1.221:8848","lastRefTime":0,"lastRefTimeStr":"","servePort":8848,"site":"unknown","weight":1}, {"adWeight":0,"alive":false,"ip":"192.168.1.99","key":"192.168.1.99:8848","lastRefTime":0,"lastRefTimeStr":"","servePort":8848,"site":"unknown","weight":1}]

The cluster seems to have started successfully, but when a service registers, the client console still shows the 503 error from node 221:

at com.alibaba.nacos.client.naming.net.NamingProxy.callServer(NamingProxy.java:340)
at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:367)
at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:304)
at com.alibaba.nacos.client.naming.net.NamingProxy.queryList(NamingProxy.java:217)
at com.alibaba.nacos.client.naming.core.HostReactor.updateServiceNow(HostReactor.java:273)
at com.alibaba.nacos.client.naming.core.HostReactor$UpdateTask.run(HostReactor.java:318)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2019-05-20 12:34:26.943 [com.alibaba.nacos.client.naming.updater] ERROR com.alibaba.nacos.client.naming -
request 192.168.1.221:8848 failed.
com.alibaba.nacos.api.exception.NacosException: failed to req API:http://192.168.1.221:8848/nacos/v1/ns/instance/list. code:503 msg: server is DOWN now, please try again later!
at com.alibaba.nacos.client.naming.net.NamingProxy.callServer(NamingProxy.java:340)
at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:367)
at com.alibaba.nacos.client.naming.net.NamingProxy.reqAPI(NamingProxy.java:304)
at com.alibaba.nacos.client.naming.net.NamingProxy.queryList(NamingProxy.java:217)
at com.alibaba.nacos.client.naming.core.HostReactor.updateServiceNow(HostReactor.java:273)
at com.alibaba.nacos.client.naming.core.HostReactor$UpdateTask.run(HostReactor.java:318)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

@dolcevitaforever

@nkorange I pasted the wrong log for 99; this is the correct excerpt:
2019-05-20 12:27:33,825 INFO received approve from peer: {"heartbeatDueMs":3500,"ip":"192.168.1.220:8848","leaderDueMs":140,"state":"FOLLOWER","term":37,"voteFor":"192.168.1.221:8848"}

2019-05-20 12:27:33,885 INFO vote 192.168.1.220:8848 as leader, term: 38

2019-05-20 12:27:38,849 INFO 192.168.1.220:8848 has become the LEADER, local: {"heartbeatDueMs":5000,"ip":"192.168.1.99:8848","leaderDueMs":18519,"state":"FOLLOWER","term":38,"voteFor":"192.168.1.220:8848"}, leader: {"heartbeatDueMs":5000,"ip":"192.168.1.220:8848","leaderDueMs":18222,"state":"LEADER","term":38,"voteFor":"192.168.1.220:8848"}

2019-05-20 12:27:38,849 INFO [RAFT] received beat with 0 keys, RaftCore.datums' size is 0, remote server: 192.168.1.220:8848, term: 38, local term: 38
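
To compare what each node's naming-raft.log says about leadership, the `has become the LEADER` lines in the excerpts above can be parsed mechanically. A minimal sketch, assuming the line format stays exactly as pasted in this thread (two flat JSON objects after `local:` and `leader:`); the function name `parse_leader_line` is hypothetical:

```python
import json
import re

# Matches "<addr> has become the LEADER, local: {...}, leader: {...}".
# The JSON objects in these lines are flat, so [^{}]* is sufficient.
LEADER_RE = re.compile(
    r"(?P<addr>\S+) has become the LEADER, "
    r"local: (?P<local>\{[^{}]*\}), leader: (?P<leader>\{[^{}]*\})"
)


def parse_leader_line(line):
    """Extract leader address and terms from one naming-raft.log line."""
    match = LEADER_RE.search(line)
    if match is None:
        return None  # not a leader-change line
    local = json.loads(match.group("local"))
    leader = json.loads(match.group("leader"))
    return {
        "leader": match.group("addr"),
        "local_ip": local["ip"],
        "local_term": local["term"],
        "leader_term": leader["term"],
    }
```

Applied to the line logged on 221 after its restart, this yields leader `192.168.1.220:8848` with a local term of 0 against a leader term of 38, while the corrected excerpt from 99 shows both terms at 38.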

@nkorange
Collaborator

@dolcevitaforever Run this on 220:

curl '127.0.0.1:8848/nacos/v1/ns/operator/servers?healthy=true' -H 'User-Agent:Nacos-Server'

and see what it returns.

@dolcevitaforever

@nkorange Hello, this is what it returns when run on 220:
[root@stackmaster logs]# curl '127.0.0.1:8848/nacos/v1/ns/operator/servers?healthy=true' -H 'User-Agent:Nacos-Server'
{"servers":[{"ip":"192.168.1.220","servePort":8848,"site":"unknown","weight":4,"adWeight":0,"alive":true,"lastRefTime":1558325952859,"lastRefTimeStr":"2019-05-20 12:19:12","key":"192.168.1.220:8848"},{"ip":"192.168.1.221","servePort":8848,"site":"unknown","weight":2,"adWeight":0,"alive":true,"lastRefTime":1558326120274,"lastRefTimeStr":"2019-05-20 12:22:00","key":"192.168.1.221:8848"},{"ip":"192.168.1.99","servePort":8848,"site":"unknown","weight":3,"adWeight":0,"alive":true,"lastRefTime":1558325965228,"lastRefTimeStr":"2019-05-20 12:19:25","key":"192.168.1.99:8848"}]}
[root@stackmaster logs]#
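
The same check can be scripted so that the response is easier to read on each node. A minimal sketch using only the Python standard library; the endpoint and the `User-Agent` header come from the curl command above, while the function names `fetch_servers` and `alive_by_key` and the 5-second timeout are assumptions made for illustration:

```python
import json
import urllib.request


def fetch_servers(base_url="http://127.0.0.1:8848"):
    """Call the server-list endpoint used in the curl command above."""
    request = urllib.request.Request(
        base_url + "/nacos/v1/ns/operator/servers?healthy=true",
        headers={"User-Agent": "Nacos-Server"},
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.read().decode("utf-8")


def alive_by_key(payload):
    """Map each server key (ip:port) to its 'alive' flag."""
    servers = json.loads(payload)["servers"]
    return {server["key"]: server["alive"] for server in servers}


# Usage on a node (requires a running Nacos server):
#   print(alive_by_key(fetch_servers()))
```

For the response pasted above, all three keys map to `True`, so node 220 considers every cluster member alive even though 221 still answers clients with 503.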

@nkorange
Collaborator

nkorange commented May 20, 2019

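@dolcevitaforever This is a bug; see #1091. The fix will be released in 1.0.1. For now, you can run this command on the master branch:

 mvn -Prelease-nacos clean install -U

to build a package containing the latest fix. The built package is placed under distribution/target/.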

@dolcevitaforever

OK, thanks @nkorange
