kubectl get componentstatus and etcd from unix socket #70741

Closed

@ktsakalozos

Description
What happened:
microk8s.kubectl get componentstatus reports etcd as not healthy:

> microk8s.kubectl get componentstatuses                                                          
NAME                 STATUS      MESSAGE                                                                                           ERROR      
etcd-0               Unhealthy   Get http://etcd.socket:2379/health: dial tcp: lookup etcd.socket on 127.0.0.53:53: no such host              
controller-manager   Healthy     ok                                                                                                           
scheduler            Healthy     ok            

What you expected to happen:
The cluster is healthy and I have no issues using it, so I would expect:

> microk8s.kubectl get componentstatuses
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health":"true"}

How to reproduce it (as minimally and precisely as possible):

sudo snap install microk8s --classic
microk8s.kubectl get componentstatuses

Anything else we need to know?:
The problem is that etcd is configured to listen on a unix socket rather than a TCP port. You can see the etcd configuration under /var/snap/microk8s/current/args/etcd (and also here: https://github.com/ubuntu/microk8s/blob/master/microk8s-resources/default-args/etcd). The error says that etcd-0 is not available at http://..., which is true, since it is actually available at unix://...
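
The failing probe URL can be explained with a rough sketch of the rewrite involved. This is an illustration only, not the actual kube-apiserver code, and it assumes the configured etcd client URL is unix://etcd.socket:2379 (matching the hostname in the error): the scheme is effectively replaced with http and the rest of the address is kept, so the socket name ends up being treated as a DNS hostname.

```shell
# Illustrative sketch only -- NOT the real kube-apiserver logic.
# Assumes an etcd client URL of unix://etcd.socket:2379.
etcd_server="unix://etcd.socket:2379"
authority="${etcd_server#*://}"            # strip the scheme -> etcd.socket:2379
probe_url="http://${authority}/health"     # health probe goes over plain HTTP
echo "$probe_url"                          # prints http://etcd.socket:2379/health
```

The resulting host "etcd.socket" is then handed to DNS, which produces exactly the "no such host" error shown above.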

You can reconfigure etcd and the API server to talk over a TCP port; the component status then reports everything as healthy. Update /var/snap/microk8s/current/args/etcd with:

--advertise-client-urls=http://localhost:2379
--listen-client-urls=http://localhost:2379

Update /var/snap/microk8s/current/args/kube-apiserver with:

--etcd-servers='http://localhost:2379'

Restart the two services:

sudo systemctl restart snap.microk8s.daemon-etcd
sudo systemctl restart snap.microk8s.daemon-apiserver.service
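
If you want to double-check the new TCP listener before rerunning kubectl, you can query etcd's health endpoint directly (this assumes curl is installed and etcd came back up on localhost:2379):

```shell
# Query etcd's HTTP health endpoint over the new TCP listener;
# a healthy etcd answers with {"health":"true"}.
curl -s http://localhost:2379/health
```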

Now microk8s.kubectl get componentstatuses reports everything as healthy.

Environment: Linux

  • Kubernetes version (use kubectl version): 1.12
  • Cloud provider or hardware configuration: local machine
  • OS (e.g. from /etc/os-release): NAME="Ubuntu" VERSION="18.04.1 LTS (Bionic Beaver)"
  • Kernel (e.g. uname -a): 4.15.0-34-generic
  • Install tools: snap
  • Others:

/kind bug

Activity

added labels kind/bug (Categorizes issue or PR as related to a bug) and needs-sig (Indicates an issue or PR lacks a `sig/foo` label and requires one) on Nov 7, 2018
dims (Member) commented on Nov 7, 2018

please report in the microk8s repo - https://github.com/ubuntu/microk8s

/close

k8s-ci-robot (Contributor) commented on Nov 7, 2018

@dims: Closing this issue.

In response to this:

please report in the microk8s repo - https://github.com/ubuntu/microk8s

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

ktsakalozos (Contributor, Author) commented on Nov 7, 2018

Hi @dims, I appreciate your prompt reply.

This does not seem to be a microk8s issue. If you configure etcd to listen on a unix socket and point the apiserver at that socket, etcd does not register as healthy. The use of unix sockets between the apiserver and etcd is valid, right? Are you suggesting there is a bug in the configuration of microk8s? Would you like me to reproduce this bug on another distribution?

dims (Member) commented on Nov 7, 2018

@ktsakalozos that would be great, since very few of us are familiar with how things are set up under microk8s. If you can just use the latest upstream release, that would be even better.

From your report, if I read it correctly, kube-apiserver is currently not able to talk to etcd when etcd is listening on a unix socket, and the default implementation's failure to connect to etcd causes the reported failures. Not sure if this is to be taken as a bug or a feature request :)

/sig api-machinery

added label sig/api-machinery (Categorizes an issue or PR as relevant to SIG API Machinery) and removed needs-sig on Nov 7, 2018
ktsakalozos (Contributor, Author) commented on Nov 7, 2018

kube-apiserver is not able to talk to etcd when etcd is listening on a unix socket

No. The kube-apiserver can talk to etcd with no problem; the cluster is operational and I do not see any issue with it. It is only get componentstatus that falsely reports etcd as unavailable. It seems the case where etcd is served over a unix:// socket is not taken into account when probing etcd.

not sure if this is to be taken as a bug or a feature request

I hope things are clearer now. This is not a feature request.

dims (Member) commented on Nov 7, 2018

@ktsakalozos do you see any errors in the api server logs when you run get componentstatus?

ktsakalozos (Contributor, Author) commented on Nov 7, 2018

I do not see any errors on the API server. Here is a portion of the log:

Nov 07 16:44:13 jackal-VGN-FZ11M microk8s.daemon-apiserver[17264]: I1107 16:44:13.361732   17264 wrap.go:42] PUT /api/v1/namespaces/kube-system/endpoints/kube-controller-manager?timeout=10s: (1.494985ms) 200 [kube-controller-manager/v1.12.2 (linux/amd64) kubernetes/17c77c7/leader-election 127.0.0.1:35152]
Nov 07 16:44:14 jackal-VGN-FZ11M microk8s.daemon-apiserver[17264]: I1107 16:44:14.166036   17264 handler.go:153] kube-aggregator: GET "/api/v1/componentstatuses" satisfied by nonGoRestful
Nov 07 16:44:14 jackal-VGN-FZ11M microk8s.daemon-apiserver[17264]: I1107 16:44:14.166319   17264 pathrecorder.go:247] kube-aggregator: "/api/v1/componentstatuses" satisfied by prefix /api/
Nov 07 16:44:14 jackal-VGN-FZ11M microk8s.daemon-apiserver[17264]: I1107 16:44:14.166471   17264 handler.go:143] kube-apiserver: GET "/api/v1/componentstatuses" satisfied by gorestful with webservice /api/v1
Nov 07 16:44:14 jackal-VGN-FZ11M microk8s.daemon-apiserver[17264]: I1107 16:44:14.168528   17264 http.go:96] Probe succeeded for http://127.0.0.1:10252/healthz, Response: {200 OK 200 HTTP/1.1 1 1 map[Date:[Wed, 07 Nov 2018 14:44:14 GMT] Content-Length:[2] Content-Type:[text/plain; charset=utf-8]] 0xc427e0a340 2 [] true false map[] 0xc428c5b100 <nil>}
Nov 07 16:44:14 jackal-VGN-FZ11M microk8s.daemon-apiserver[17264]: I1107 16:44:14.169257   17264 http.go:96] Probe succeeded for http://127.0.0.1:10251/healthz, Response: {200 OK 200 HTTP/1.1 1 1 map[Date:[Wed, 07 Nov 2018 14:44:14 GMT] Content-Length:[2] Content-Type:[text/plain; charset=utf-8]] 0xc427e0a440 2 [] true false map[] 0xc428c5b300 <nil>}
Nov 07 16:44:14 jackal-VGN-FZ11M microk8s.daemon-apiserver[17264]: I1107 16:44:14.169658   17264 wrap.go:42] GET /api/v1/componentstatuses?limit=500: (3.759774ms) 200 [kubectl/v1.12.2 (linux/amd64) kubernetes/17c77c7 127.0.0.1:35290]
Nov 07 16:44:15 jackal-VGN-FZ11M microk8s.daemon-apiserver[17264]: I1107 16:44:15.348317   17264 handler.go:153] kube-aggregator: GET "/api/v1/namespaces/kube-system/endpoints/kube-scheduler" satisfied by nonGoRestful
Nov 07 16:44:15 jackal-VGN-FZ11M microk8s.daemon-apiserver[17264]: I1107 16:44:15.348350   17264 pathrecorder.go:247] kube-aggregator: "/api/v1/namespaces/kube-system/endpoints/kube-scheduler" satisfied by prefix /api/
Nov 07 16:44:15 jackal-VGN-FZ11M microk8s.daemon-apiserver[17264]: I1107 16:44:15.348383   17264 handler.go:143] kube-apiserver: GET "/api/v1/namespaces/kube-system/endpoints/kube-scheduler" satisfied by gorestful with webservice /api/v1
Nov 07 16:44:15 jackal-VGN-FZ11M microk8s.daemon-apiserver[17264]: I1107 16:44:15.349262   17264 wrap.go:42] GET /api/v1/namespaces/kube-system/endpoints/kube-scheduler?timeout=10s: (992.454µs) 200 [kube-scheduler/v1.12.2 (linux/amd64) kubernetes/17c77c7/leader-election 127.0.0.1:44138]

And here is the invocation:

> microk8s.kubectl get componentstatus -v=8
I1107 16:57:53.577822   27540 loader.go:359] Config loaded from file /snap/microk8s/266/client.config
I1107 16:57:53.578234   27540 loader.go:359] Config loaded from file /snap/microk8s/266/client.config
I1107 16:57:53.579997   27540 loader.go:359] Config loaded from file /snap/microk8s/266/client.config
I1107 16:57:53.582973   27540 loader.go:359] Config loaded from file /snap/microk8s/266/client.config
I1107 16:57:53.583272   27540 round_trippers.go:383] GET http://127.0.0.1:8080/api/v1/componentstatuses?limit=500
I1107 16:57:53.583289   27540 round_trippers.go:390] Request Headers:
I1107 16:57:53.583298   27540 round_trippers.go:393]     Accept: application/json;as=Table;v=v1beta1;g=meta.k8s.io, application/json
I1107 16:57:53.583307   27540 round_trippers.go:393]     User-Agent: kubectl/v1.12.2 (linux/amd64) kubernetes/17c77c7
I1107 16:57:53.585873   27540 round_trippers.go:408] Response Status: 200 OK in 2 milliseconds
I1107 16:57:53.586105   27540 round_trippers.go:411] Response Headers:
I1107 16:57:53.586120   27540 round_trippers.go:414]     Content-Type: application/json
I1107 16:57:53.586131   27540 round_trippers.go:414]     Date: Wed, 07 Nov 2018 14:57:53 GMT
I1107 16:57:53.586138   27540 round_trippers.go:414]     Content-Length: 736
I1107 16:57:53.586213   27540 request.go:942] Response Body: {"kind":"ComponentStatusList","apiVersion":"v1","metadata":{"selfLink":"/api/v1/componentstatuses"},"items":[{"metadata":{"name":"etcd-0","selfLink":"/api/v1/componentstatuses/etcd-0","creationTimestamp":null},"conditions":[{"type":"Healthy","status":"False","message":"Get http://etcd.socket:2379/health: dial tcp: lookup etcd.socket on 127.0.0.53:53: no such host"}]},{"metadata":{"name":"scheduler","selfLink":"/api/v1/componentstatuses/scheduler","creationTimestamp":null},"conditions":[{"type":"Healthy","status":"True","message":"ok"}]},{"metadata":{"name":"controller-manager","selfLink":"/api/v1/componentstatuses/controller-manager","creationTimestamp":null},"conditions":[{"type":"Healthy","status":"True","message":"ok"}]}]}
I1107 16:57:53.586877   27540 loader.go:359] Config loaded from file /snap/microk8s/266/client.config
I1107 16:57:53.587283   27540 loader.go:359] Config loaded from file /snap/microk8s/266/client.config
I1107 16:57:53.587681   27540 loader.go:359] Config loaded from file /snap/microk8s/266/client.config
I1107 16:57:53.587915   27540 get.go:474] Unable to decode server response into a Table. Falling back to hardcoded types: attempt to decode non-Table object into a v1beta1.Table
I1107 16:57:53.587929   27540 get.go:474] Unable to decode server response into a Table. Falling back to hardcoded types: attempt to decode non-Table object into a v1beta1.Table
I1107 16:57:53.587939   27540 get.go:474] Unable to decode server response into a Table. Falling back to hardcoded types: attempt to decode non-Table object into a v1beta1.Table
NAME                 STATUS      MESSAGE                                                                                           ERROR
etcd-0               Unhealthy   Get http://etcd.socket:2379/health: dial tcp: lookup etcd.socket on 127.0.0.53:53: no such host   
scheduler            Healthy     ok                                                                                                
controller-manager   Healthy     ok                                                                  

The logs are from journalctl -u snap.microk8s.daemon-apiserver.service after adding -v=9 to /var/snap/microk8s/current/args/kube-apiserver.

okrause commented on Nov 27, 2018

microk8s.daemon-etcd[1463]: WARNING: 2018/11/27 12:31:06 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: Error while dialing dial tcp: lookup etcd.socket on 127.0.0.53:53: no such host"; Reconnecting to {etcd.socket:2379 0 }

Same problem here. microk8s.daemon-apiserver is using my system's DNS resolver.

$ sudo netstat -anp | grep '127.0.0.53:53'
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN 831/systemd-resolve
udp 51456 0 127.0.0.53:53 0.0.0.0:* 831/systemd-resolve

My system's DNS resolver doesn't know the "etcd.socket" host.

$ dig @127.0.0.53 etcd.socket

; <<>> DiG 9.11.3-1ubuntu1.3-Ubuntu <<>> @127.0.0.53 etcd.socket
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 42808
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;etcd.socket. IN A

;; Query time: 0 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Tue Nov 27 12:39:37 CET 2018
;; MSG SIZE rcvd: 40

Is it supposed to ask the system's DNS resolver? I assume it is supposed to ask kube-dns.
