
kubectl get cs showing unhealthy statuses for controller-manager and scheduler on v1.18.6 clean install #93472

Closed
vainkop opened this issue Jul 27, 2020 · 10 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle.

Comments

vainkop commented Jul 27, 2020

What happened:

  1. I've installed a cluster with kubeadm v1.18.6:
    kubeadm init --v=6 --config=kubeadmin/kubeadm-init.yaml --upload-certs
    cat kubeadm-init.yaml
---
apiVersion: kubeadm.k8s.io/v1beta1
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: "xxx.xxx.xxx.xxx"
---
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: stable
apiServer:
  certSANs:
  - xxx.xxx.xxx.xxx
  - yyy.yyy.yyy.yyy
  - zzz.zzz.zzz.zzz
  - 127.0.0.1
controlPlaneEndpoint: xxx.xxx.xxx.xxx
etcd:
  external:
    endpoints:
    - http://xxx.xxx.xxx.xxx:2379
    - http://yyy.yyy.yyy.yyy:2379
    - http://zzz.zzz.zzz.zzz:2379
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: "10.96.0.0/12"
  dnsDomain: "cluster.local"
  2. Out of the box I'm getting unhealthy statuses:
    kubectl get cs
NAME                 STATUS      MESSAGE                                                                                     ERROR
controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused   
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused   
etcd-1               Healthy     {"health":"true"}                                                                           
etcd-2               Healthy     {"health":"true"}                                                                           
etcd-0               Healthy     {"health":"true"}

What you expected to happen:

NAME                 STATUS      MESSAGE                                                                                     ERROR
controller-manager   Healthy     {"health":"true"}
scheduler            Healthy     {"health":"true"}
etcd-1               Healthy     {"health":"true"}                                                                           
etcd-2               Healthy     {"health":"true"}                                                                           
etcd-0               Healthy     {"health":"true"}

How to reproduce it (as minimally and precisely as possible):
Deploy cluster with kubeadm v1.18.6 + Flannel
From any master node run:
kubectl get cs
Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): v1.18.6
  • Cloud provider or hardware configuration: Hetzner Cloud, Hetzner dedicated rootservers, Fornex Dedicated Servers
  • OS (e.g: cat /etc/os-release):
    Debian GNU/Linux 10 (buster)
  • Kernel (e.g. uname -a):
    4.19.0-9-amd64 #1 SMP Debian 4.19.118-2+deb10u1 (2020-06-07) x86_64 GNU/Linux
  • Install tools:
    kubeadm
  • Network plugin and version (if this is a network-related bug):
    Latest Flannel
    Other:
    Default config after kubeadm init:
    cat /etc/kubernetes/manifests/kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=10.244.0.0/16
    - --cluster-name=kubernetes
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    - --node-cidr-mask-size=24
    - --port=0
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --use-service-account-credentials=true
    image: k8s.gcr.io/kube-controller-manager:v1.18.6
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10257
        scheme: HTTPS
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: kube-controller-manager
    resources:
      requests:
        cpu: 200m
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: ca-certs
      readOnly: true
    - mountPath: /etc/ca-certificates
      name: etc-ca-certificates
      readOnly: true
    - mountPath: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
      name: flexvolume-dir
    - mountPath: /etc/kubernetes/pki
      name: k8s-certs
      readOnly: true
    - mountPath: /etc/kubernetes/controller-manager.conf
      name: kubeconfig
      readOnly: true
    - mountPath: /usr/local/share/ca-certificates
      name: usr-local-share-ca-certificates
      readOnly: true
    - mountPath: /usr/share/ca-certificates
      name: usr-share-ca-certificates
      readOnly: true
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/ca-certificates
      type: DirectoryOrCreate
    name: etc-ca-certificates
  - hostPath:
      path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
      type: DirectoryOrCreate
    name: flexvolume-dir
  - hostPath:
      path: /etc/kubernetes/pki
      type: DirectoryOrCreate
    name: k8s-certs
  - hostPath:
      path: /etc/kubernetes/controller-manager.conf
      type: FileOrCreate
    name: kubeconfig
  - hostPath:
      path: /usr/local/share/ca-certificates
      type: DirectoryOrCreate
    name: usr-local-share-ca-certificates
  - hostPath:
      path: /usr/share/ca-certificates
      type: DirectoryOrCreate
    name: usr-share-ca-certificates
status: {}

P.S. I know that the insecure ports 10251 & 10252 are deprecated, but why are they used in the out-of-the-box config, and how do I properly "fix" it?
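
For context: kubectl get cs probes the legacy insecure HTTP ports (10251 for kube-scheduler, 10252 for kube-controller-manager), which the manifest above disables via --port=0, while the livenessProbe targets the secure HTTPS port instead. A minimal sketch of checking the components directly, assuming the default secure ports 10257 (kube-controller-manager) and 10259 (kube-scheduler); -k skips verification of the self-signed serving certificate:

# Query the secure /healthz endpoints directly on a control-plane node
curl -k https://127.0.0.1:10257/healthz   # kube-controller-manager, should print "ok"
curl -k https://127.0.0.1:10259/healthz   # kube-scheduler, should print "ok"

If both print "ok", the components themselves are healthy and only the componentstatus probe is failing.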

@k8s-ci-robot /sig api-machinery
@k8s-ci-robot /wg component-standard

@vainkop vainkop added the kind/bug Categorizes issue or PR as related to a bug. label Jul 27, 2020
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jul 27, 2020
@neolit123
Member

please see here:
#93342

cs should be deprecated eventually.

/close

@k8s-ci-robot
Contributor

@neolit123: Closing this issue.

In response to this:

please see here:
#93342

cs should be deprecated eventually.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@neolit123
Member

/sig api-machinery

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jul 27, 2020
@neolit123
Member

/sig cluster-lifecycle

@k8s-ci-robot k8s-ci-robot added the sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. label Jul 27, 2020
@CzarMich

Getting same issues
~$ kubectl get cs
NAME                 STATUS      MESSAGE                                                                                     ERROR
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused
etcd-0               Healthy     {"health":"true"}

~$ kubectl get componentstatus
NAME                 STATUS      MESSAGE                                                                                     ERROR
controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
etcd-0               Healthy     {"health":"true"}

Is there any way to solve this, or is there a workaround?


vainkop commented Aug 20, 2020

In reply to @CzarMich:

> Is there any way to solve this, or is there a workaround?

Stop using it, as it's officially deprecated in 1.19: https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#deprecation

Kube-apiserver: the componentstatus API is deprecated. This API provided status of etcd, kube-scheduler, and kube-controller-manager components, but only worked when those components were local to the API server, and when kube-scheduler and kube-controller-manager exposed unsecured health endpoints. Instead of this API, etcd health is included in the kube-apiserver health check and kube-scheduler/kube-controller-manager health checks can be made directly against those components' health endpoints. (#93570, @liggitt) [SIG API Machinery, Apps and Cluster Lifecycle]
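
Since etcd health is folded into the kube-apiserver health check, the individual checks can be listed via the API server's verbose health endpoint (available since v1.16); a quick sketch:

kubectl get --raw='/readyz?verbose'

The output lists each check on its own line, including an etcd entry such as "[+]etcd ok", which replaces the etcd rows that kubectl get cs used to show.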

@CardinS2U

how do you stop using the component?


a-glanville commented Aug 7, 2021

It looks like you can remove the line "- --port=0" from
/etc/kubernetes/manifests/kube-scheduler.yaml and
/etc/kubernetes/manifests/kube-controller-manager.yaml,
then restart the kubelet and test again.
I am running version v1.20.1 and this resolved the error. The actual services do appear to be up and returning "ok" when queried.
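
A minimal sketch of that workaround on each control-plane node (note this re-enables the deprecated insecure ports, so treat it as a stopgap; kubeadm upgrades may regenerate the manifests and restore --port=0):

sudo sed -i '/- --port=0/d' /etc/kubernetes/manifests/kube-scheduler.yaml
sudo sed -i '/- --port=0/d' /etc/kubernetes/manifests/kube-controller-manager.yaml
sudo systemctl restart kubelet   # optional; the kubelet also picks up static-pod manifest changes on its own
kubectl get cs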


CzarMich commented Aug 8, 2021 via email

@mithlajkn

Getting the same issue:

$ kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                       ERROR
scheduler            Unhealthy   Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused   
controller-manager   Healthy     ok                                                                                            
etcd-0               Healthy     {"health":"true","reason":""}
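
In the output above, controller-manager now reports Healthy ("ok") while scheduler still fails on the insecure port, which suggests --port=0 was removed from only one of the two manifests. Assuming the workaround above was applied, a quick check is:

grep -e '--port=0' /etc/kubernetes/manifests/kube-scheduler.yaml /etc/kubernetes/manifests/kube-controller-manager.yaml

Any match printed means the insecure port is still disabled for that component.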
