Description
Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.):
No.
What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.):
CLOSE_WAIT
Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT
Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.4", GitCommit:"7243c69eb523aa4377bce883e7c0dd76b84709a1", GitTreeState:"clean", BuildDate:"2017-03-08T02:48:58Z", GoVersion:"go1.8", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2", GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"2017-01-12T04:52:34Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Environment:
- Cloud provider or hardware configuration: AWS
- OS (e.g. from /etc/os-release): Debian GNU/Linux 8 (jessie)
- Kernel (e.g. uname -a): Linux ip-172-31-64-14 4.4.41-k8s #1 SMP Mon Jan 9 15:34:39 UTC 2017 x86_64 GNU/Linux
- Install tools: kops 1.5.3
- Others:
What happened:
The problem was triggered when a Service of type=LoadBalancer was left without ready Pods. This triggers a reproducible wave of CLOSE_WAIT connections on the master node(s).
What you expected to happen:
There should not be any flooding of CLOSE_WAIT connections.
How to reproduce it (as minimally and precisely as possible):
- Start a cluster with Kops v1.5.3 (kubernetes v1.5.2) at AWS
- Create a Service type=LoadBalancer (without attached Pods)
This should trigger the CLOSE_WAIT flood on the master (a minimal manifest for the Service-creation step is sketched below).
Note that because kops v1.5.3 uses taints instead of SchedulingDisabled (kubernetes/kops#639), the master nodes are also registered with the ELB on AWS.
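A minimal sketch of the Service-creation step, assuming kubectl access to the cluster; the name close-wait-test and the ports are arbitrary, and the selector deliberately matches no Pods so the Service never gets endpoints:
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Service
metadata:
  name: close-wait-test
spec:
  type: LoadBalancer
  # No Pods carry this label, so the Service stays without endpoints.
  selector:
    app: close-wait-test
  ports:
  - port: 80
    targetPort: 80
EOF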
Anything else we need to know:
- As a workaround, the master can be marked as unschedulable. This causes the ELB to exclude the master, and the CLOSE_WAIT count stops rising.
kubectl patch node MASTER_NAME -p "{\"spec\":{\"unschedulable\":true}}"
- If Pods are added to the LoadBalancer Service, the CLOSE_WAIT count stops rising once they become ready.
- The CLOSE_WAIT count starts rising on the master only once the ELB marks the master node as "InService", not before.
- Once too many CLOSE_WAIT connections accumulate (they can be watched with the commands sketched at the end of this report), the following error appears, the master is marked as not_ready, and SSH becomes unresponsive. Logs were gathered from "AWS > Instance Settings > Get System Log":
TCP: out of memory… consider tuning tcp_mem
- The issue has been reproduced in a different cluster and AWS account.
Reported together with @mikim83.
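For reference, the CLOSE_WAIT growth and the TCP memory pressure described above can be watched with standard Linux tooling on the master (nothing Kubernetes-specific); a minimal sketch:
# Count of TCP sockets currently stuck in CLOSE_WAIT, refreshed every 5 seconds.
watch -n 5 'ss -tan | grep -c CLOSE-WAIT'
# The tcp_mem thresholds (in pages) that the "TCP: out of memory" message refers to.
cat /proc/sys/net/ipv4/tcp_mem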
Activity
justinsb commented on Mar 18, 2017
Thank you for the excellent report!
So my theory is this:
For 3: I confirmed that during the time a service had no pods, the KUBE-SVC-UV4XIKEQLMZEPCEV chain (in my case) was removed entirely... and kube-proxy was listening on 31445 (the NodePort).
Also, the number of CLOSE_WAIT connections went up during the time period when I restarted the pod in my service.
I confirmed that 172.20.104.161 and 172.20.108.40 are the IP addresses of my ELBs. They are doing TCP health checks every 5s (IIRC). It is also possible that the health check is an unusual TCP pattern (because it is not an HTTP health check; it merely opens and closes the connection).
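A sketch of the checks behind these observations, assuming shell access to the affected node; the chain name and NodePort are the ones quoted above and will differ per cluster and Service:
# Is the per-Service chain still present in kube-proxy's iptables rules?
iptables-save | grep KUBE-SVC-UV4XIKEQLMZEPCEV
# Is kube-proxy itself listening on the NodePort?
ss -tlnp | grep 31445
# How many connections involving that NodePort are stuck in CLOSE_WAIT?
ss -tan | grep 31445 | grep -c CLOSE-WAIT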
For this particular issue, which was about the master, my suspicion is that the same will happen on the nodes, since the kube-proxy configuration should be the same. If we actually know that this does not happen on the nodes, that is interesting information.
Two possible fixes spring to mind:
A) Add a rule that rejects connections to a NodePort while it has no Pods (a rough iptables sketch follows this list). Efficient, but iptables is never easy. I also don't know whether this would cause health checks to fail, which isn't wrong but would slow down recovery.
B) Have kube-proxy accept connections and immediately close them. This feels correct, though I think it will consume a goroutine per nodeport. But goroutines are cheap.
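A rough manual approximation of option A, assuming the NodePort 31445 quoted above; this only illustrates the idea and is not the rule kube-proxy itself ended up installing:
# While the Service has no endpoints, answer new connections to the NodePort
# with a TCP RST instead of letting them be accepted and linger in CLOSE_WAIT.
iptables -I INPUT -p tcp --dport 31445 -j REJECT --reject-with tcp-reset
# Remove the rule again once the Service has ready endpoints.
iptables -D INPUT -p tcp --dport 31445 -j REJECT --reject-with tcp-reset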
cc @felipejfc as this looks similar to what you are reporting in #41640
cc @thockin for kube-proxy guru-ness and advice on which option to pursue
felipejfc commented on Mar 18, 2017
I actually had 2 namespaces with 3 services each (ELB type) that had no pods associated, because someone forgot to delete them after deleting the pods, and they had health checks configured. After deleting the services today we've seen a massive networking performance boost (thousands of sockets in CLOSE_WAIT state got closed).
I'll look for other services with no pods associated and delete them (for example with the command sketched below), and will keep an eye on the cluster.
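One way to spot such Services is to list Endpoints objects that are currently empty; a minimal sketch, relying on the default kubectl table output where empty Endpoints render as <none>:
# Every line printed here is a Service whose Endpoints object has no addresses.
kubectl get endpoints --all-namespaces | grep '<none>'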
Thanks for helping @justinsb !!
Merge pull request #43415 from thockin/fix-nodeport-close-wait
thockin commented on May 11, 2017
@justinsb do you recall if we ported this back to 1.6?
justinsb commented on May 11, 2017
@thockin looks like we got the first one into 1.6, but not the second one :-(
These are the two commits (for some reason github only shows the branches in this view, not the PR view...)
2ec8799
9a423b6
I did reopen the cherry-pick for the 1st to 1.5 this morning (I've been getting pings on this issue): #43858
Looks like we should get #43858 in, and then cherry-pick 9a423b6 to 1.5 and 1.6.
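For anyone wondering whether a given release already contains these fixes, one way to check is to ask git which release tags contain the commits, assuming a local clone of kubernetes/kubernetes:
# Lists every release tag that includes the second fix commit.
git tag --contains 9a423b6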
I do recommend to people hitting this in the real world that they remove services without endpoints - it is almost always just an error/oversight. I don't think it's a huge problem to leak a few connections on a restart if you happen to end up with no pods for a ~minute. Also, typically this saves the cost of an extra ELB.
exarkun commented on Jul 17, 2017
What version of Kubernetes is it expected that this issue is resolved in? The problem still manifests on my Kubernetes 1.6.3 deployment.
0xMadao commented on Sep 28, 2017
Still on Kubernetes 1.6.4 on AWS: the LoadBalancer-type Services in our cluster all have Pods associated, but we still see thousands of connections in CLOSE_WAIT.