Description
What happened: kubectl exec fails or times out:
Connection refused:
kubectl exec -it -n my-ns my-pod sh
error: unable to upgrade connection: error dialing backend: dial tcp 127.0.0.1:37751: connect: connection refused
Timeout:
kubectl -n my-ns exec -it my-pod -v 99 bash
...
I1118 10:11:32.945560 9992 round_trippers.go:419] curl -k -v -XPOST -H "X-Stream-Protocol-Version: v4.channel.k8s.io" -H "X-Stream-Protocol-Version: v3.channel.k8s.io" -H "X-Stream-Protocol-Version: v2.channel.k8s.io" -H "X-Stream-Protocol-Version: channel.k8s.io" -H "User-Agent: kubectl.exe/v1.15.3 (windows/amd64) kubernetes/2d3c76f" -H "Authorization: Bearer kubeconfig-u-ob5wqxfcaq:fc5gvmsxt2j5z8s227gk5t9v7f5rc9hlc7fqpxv8tnm56g8lbjnws2" 'https://api-sever.mycorp.com/k8s/clusters/c-9skrw/api/v1/namespaces/my-ns/pods/my-pod/exec?command=bash&container=minio&stdin=true&stdout=true&tty=true'
I1118 10:13:43.699847 9992 round_trippers.go:438] POST https://api-sever.mycorp.com/k8s/clusters/c-9skrw/api/v1/namespaces/my-ns/pods/my-pod/exec?command=bash&container=minio&stdin=true&stdout=true&tty=true 500 Internal Server Error in 130752 milliseconds
I1118 10:13:43.712034 9992 round_trippers.go:444] Response Headers:
I1118 10:13:43.712034 9992 round_trippers.go:447] Server: openresty/1.15.8.1
I1118 10:13:43.712034 9992 round_trippers.go:447] Date: Mon, 18 Nov 2019 09:13:43 GMT
I1118 10:13:43.713032 9992 round_trippers.go:447] Content-Type: text/plain; charset=utf-8
I1118 10:13:43.713032 9992 round_trippers.go:447] Content-Length: 79
I1118 10:13:43.713032 9992 round_trippers.go:447] Connection: keep-alive
I1118 10:13:43.714030 9992 round_trippers.go:447] X-Content-Type-Options: nosniff
I1118 10:13:43.714030 9992 round_trippers.go:447] Strict-Transport-Security: max-age=15724800; includeSubDomains
F1118 10:13:43.717022 9992 helpers.go:114] error: unable to upgrade connection: error dialing backend: dial tcp 127.0.0.1:32935: connect: connection timed out
What you expected to happen: kubectl exec works...
How to reproduce it (as minimally and precisely as possible):
- kube-proxy runs in ipvs mode
- api-server config: service-node-port-range: 30000-39999
- kubelet starts the docker shim streaming server in the NodePort range (here 127.0.0.1:32935):
root@kubedev-worker-8b005e396435:~# netstat -anp | grep kubelet
tcp 0 0 127.0.0.1:32935 0.0.0.0:* LISTEN 2419/kubelet
tcp 0 0 127.0.0.1:10248 0.0.0.0:* LISTEN 2419/kubelet
tcp 0 0 0.0.0.0:10250 0.0.0.0:* LISTEN 2419/kubelet
- create a service with the same NodePort the streaming server uses (here 32935); an example manifest is shown below
- Wait a few seconds so kube-proxy syncs, then try to run kubectl exec with a pod on that node (kubedev-worker-8b005e396435)
- Error:
error: unable to upgrade connection: error dialing backend: dial tcp 127.0.0.1:32935: connect: connection refused
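A sketch of that Service for this example; the name, port, and selector are placeholders I made up, only the explicit nodePort matters:
cat <<EOF | kubectl apply -n my-ns -f -
apiVersion: v1
kind: Service
metadata:
  name: streaming-port-conflict   # placeholder name
spec:
  type: NodePort
  selector:
    app: my-pod                   # placeholder selector
  ports:
  - port: 80
    targetPort: 80
    nodePort: 32935               # same port the kubelet streaming server is listening on
EOF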
Anything else we need to know?:
I'm not sure how to reproduce the kubectl exec connection timed out problem, but I observed that the kubelet streaming server was using an existing NodePort in that case as well. Maybe this happens after a reboot when the kubelet starts before kube-proxy and picks a NodePort that is already in use...
Seems to me that the streaming server uses a random port that doesn't take the NodePort range into account: kubernetes/pkg/kubelet/kubelet.go, line 2294 in 4c50ee9.
Maybe an option to specify the streaming server port would fix it?
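If I read the code right, the dockershim streaming server binds to 127.0.0.1 with port 0 and lets the kernel pick a free port from the ephemeral range, which on Ubuntu 18.04 defaults to 32768-60999 and therefore overlaps a custom NodePort range like 30000-39999. The current range can be checked with:
# ephemeral port range the kernel allocates from when a process binds to port 0
sysctl net.ipv4.ip_local_port_range
# net.ipv4.ip_local_port_range = 32768 60999   (Ubuntu 18.04 default, overlaps 30000-39999)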
Environment:
- Kubernetes version (use kubectl version): v1.15.5 kubelet and api-server
- Cloud provider or hardware configuration:
- OS (e.g. cat /etc/os-release): Ubuntu 18.04.3 LTS
- Kernel (e.g. uname -a): 5.0.0-31-generic
- Install tools: kubeadm, kubectl
- Network plugin and version (if this is a network-related bug): canal with calico v3.10
- Others:
Activity
yvespp commented on Nov 18, 2019
/sig node
yvespp commented on Nov 19, 2019
Fixed by setting sysctl
net.ipv4.ip_local_port_range = 40000 60999
Maybe this should be documented somewhere?
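To make that survive reboots, a sketch using the usual sysctl drop-in mechanism (the file name is arbitrary):
# keep the kernel's ephemeral ports above the custom NodePort range (30000-39999)
echo 'net.ipv4.ip_local_port_range = 40000 60999' | sudo tee /etc/sysctl.d/99-ephemeral-ports.conf
sudo sysctl --system    # reload settings from all sysctl configuration files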
gongguan commented on Nov 19, 2019
I wonder why you could use 32935 or 37751 as a nodePort. There is a DefaultServiceNodePortRange: 30000-32767.
yvespp commented on Nov 19, 2019
You can set a custom node-port-range in the api-server, which we did:
service-node-port-range: 30000-39999
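For reference, on a kubeadm-installed control plane this flag usually lives in the kube-apiserver static pod manifest (path may differ on other installs); the kubelet restarts the static pod automatically after the file is edited:
grep -- '--service-node-port-range' /etc/kubernetes/manifests/kube-apiserver.yaml
#    - --service-node-port-range=30000-39999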
fejta-bot commented on Feb 17, 2020
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
fejta-bot commented on Mar 18, 2020
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
fejta-bot commented on Apr 17, 2020
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
k8s-ci-robot commented on Apr 17, 2020
@fejta-bot: Closing this issue.
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
k8s-ci-robot commented on Jun 17, 2020
@wktmeow: You can't reopen an issue/PR unless you authored it or you are a collaborator.
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
fl-max commented on Jan 19, 2021
For anyone else stumbling upon this issue with the error error: unable to upgrade connection: error dialing backend: dial tcp 127.0.0.1:<port>, the problem for me was that the loopback device was never started (ifconfig lo). Simply running ifup lo fixed this issue for me.
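In case ifupdown is not installed, an equivalent check and fix with iproute2 (assuming it is available):
ip link show lo      # look for the UP flag on the loopback interface
ip link set lo up    # bring it up, equivalent to `ifup lo`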
JasonRD commented on Aug 6, 2021
Had the same problem. In a cluster from a cloud vendor that set the apiserver option --service-node-port-range=30000-50000, the streaming server started up on port 32859, which conflicted with the nodePort of one Service.
AFAIK, the redirect-container-streaming option would disable the streaming server, but it was removed in v1.20.
So in this case, with service-node-port-range set to 30000-50000, the probability of a conflict increases as the number of NodePorts grows. @gongguan
If I read it right, issue #100643 is discussing this.
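A quick way to check for such a conflict on a node (illustrative commands; ss needs root to show the owning process):
ss -ltnp | grep kubelet     # port the kubelet/dockershim streaming server listens on
kubectl get svc -A -o jsonpath='{range .items[*]}{.spec.ports[*].nodePort}{" "}{end}'   # NodePorts already allocated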
JasonRD commented on Aug 15, 2021
/reopen
k8s-ci-robot commented on Aug 15, 2021
@JasonRD: You can't reopen an issue/PR unless you authored it or you are a collaborator.
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
amine250 commented on Aug 30, 2021
/reopen
k8s-ci-robot commented on Aug 30, 2021
@amine250: You can't reopen an issue/PR unless you authored it or you are a collaborator.
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
aojea commented on Sep 5, 2021
dockershim has been deprecated, so this issue is not likely to be reopened. If you have another issue, please open a new issue with all the details.