
Kubelet does not delete evicted pods #55051

Closed
rfranzke opened this issue Nov 3, 2017 · 17 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@rfranzke
Contributor

rfranzke commented Nov 3, 2017

/kind feature

What happened:
Kubelet has evicted pods due to disk pressure. Eventually, the disk pressure went away and the pods were scheduled and started again, but the evicted pods remained in the list of pods (kubectl get pod --show-all).

What you expected to happen:
Wouldn't it be better if the kubelet deleted those evicted pods? The expected behaviour is therefore to no longer see the evicted pods, i.e. that they get deleted.

How to reproduce it (as minimally and precisely as possible):
Start the kubelet with --eviction-hard and --eviction-soft set to high thresholds, or fill up the disk of a worker node.
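
For illustration, flags along these lines should trigger evictions quickly; the threshold values are made up here so that eviction fires even with plenty of free disk, they are not from the original report:

kubelet \
  --eviction-hard=nodefs.available<90% \
  --eviction-soft=imagefs.available<90% \
  --eviction-soft-grace-period=imagefs.available=1m \
  <other usual kubelet flags>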

Environment:

  • Kubernetes version (use kubectl version): 1.8.2
  • Cloud provider or hardware configuration: AWS
  • OS (e.g. from /etc/os-release): Container Linux 1465.7.0 (Ladybug)
  • Kernel (e.g. uname -a): 4.12.10-coreos
@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 3, 2017
@rfranzke
Contributor Author

rfranzke commented Nov 3, 2017

/sig node

@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Nov 3, 2017
@ghost

ghost commented Nov 3, 2017

Evicted pods in 1.7.5 are madness; the deletion of those pods is delayed by days! For example, I have a pod that was evicted 17 days ago and still appears in the pod list:

mynamespace nginxrepo-2549817493-0p91t 0/1 Evicted 0 17d
mynamespace linting-umbrellabird-monocular-api 0/1 Evicted 0 17d

In the case of nginxrepo, the deployment does not exist anymore, but the pods are still present in the list of pods as Evicted!

Pods that do not match the node selector criteria should also be deleted:

nfs-3970785943-5rnn7 0/2 MatchNodeSelector 0 17d

After 17 days the pod still appears in the list! This behavior affects Grafana, for example, because the pods appear in the list of available pods for monitoring even though they are, of course, evicted.

By the way @rfranzke, this is not a feature request, this is a bug! Could you please re-tag the issue?

Regards

@rfranzke
Contributor Author

rfranzke commented Nov 3, 2017

/kind bug

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Nov 3, 2017
@rfranzke
Contributor Author

rfranzke commented Nov 3, 2017

/remove-kind feature

@k8s-ci-robot k8s-ci-robot removed the kind/feature Categorizes issue or PR as related to a new feature. label Nov 3, 2017
@ghost

ghost commented Nov 3, 2017

Thank you @rfranzke

@liggitt
Member

liggitt commented Nov 3, 2017

Is this a duplicate of #54525?

From #54525 (comment) it sounds like this is intentional, though I'm not sure what is expected to clean up pods in this case.

@yujuhong
Contributor

yujuhong commented Nov 3, 2017

though I'm not sure what is expected to clean up pods in this case

The PodGCController in the controller manager?

@krallistic

krallistic commented Nov 7, 2017

A quick workaround we use is to delete all evicted pods manually after an incident:
kubectl get pods --all-namespaces -ojson | jq -r '.items[] | select(.status.reason!=null) | select(.status.reason | contains("Evicted")) | .metadata.name + " " + .metadata.namespace' | xargs -n2 -l bash -c 'kubectl delete pods $0 --namespace=$1'
Not as nice as automatic deletion, but it works. (Tested with 1.6.7; I heard that in 1.7 you need to add --show-all.)

@ghost

ghost commented Nov 7, 2017

Thank you @krallistic, I applied your workaround as a cronjob a long time ago, but that is not the right way!
@liggitt it is not a duplicate; that issue is about deletion timestamps not being set properly for different workload types (DaemonSet, StatefulSet, etc.). Here, it appears that evicted pods from all workload types never get deleted, and this affects the monitoring of the Kubernetes environment!

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 7, 2018
@kabakaev

I suppose this issue can be closed, because the deletion of evicted pods can be controlled through settings in kube-controller-manager.

For those k8s users who hit kube-apiserver or etcd performance issues due to too many evicted pods, I would recommend updating the kube-controller-manager config to set --terminated-pod-gc-threshold 100 or a similarly small value. The default GC threshold is 12500, which is too high for most etcd installations; reading 12500 pod records from etcd takes seconds to complete.

Also ask yourself why there are so many evicted pods. Maybe your kube-scheduler keeps scheduling pods onto a node which already reports DiskPressure or MemoryPressure? This could be the case if the kube-scheduler is configured with a custom --policy-config-file which does not include CheckNodeMemoryPressure or CheckNodeDiskPressure in its list of predicates.
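
For reference, a minimal sketch of a legacy scheduler --policy-config-file that keeps the pressure checks; the other predicate and priority names shown are only common placeholders and should be replaced with whatever your policy already lists:

{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [
    {"name": "PodFitsResources"},
    {"name": "PodFitsHostPorts"},
    {"name": "MatchNodeSelector"},
    {"name": "CheckNodeMemoryPressure"},
    {"name": "CheckNodeDiskPressure"}
  ],
  "priorities": [
    {"name": "LeastRequestedPriority", "weight": 1}
  ]
}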

$ kube-controller-manager --help 2>&1|grep terminated
      --terminated-pod-gc-threshold int32                                 Number of terminated pods that can exist before the terminated pod garbage collector starts deleting terminated pods. If <= 0, the terminated pod garbage collector is disabled. (default 12500)
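
A minimal sketch of how the flag could be wired in, assuming a kubeadm-style static pod manifest; the file path, image tag, and surrounding fields are assumptions about a typical layout, not taken from this thread:

# /etc/kubernetes/manifests/kube-controller-manager.yaml (assumed location)
apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - name: kube-controller-manager
    image: k8s.gcr.io/kube-controller-manager:v1.8.2  # assumed image tag
    command:
    - kube-controller-manager
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --terminated-pod-gc-threshold=100  # garbage-collect terminated pods far earlier than the 12500 default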

@rfranzke
Contributor Author

Thanks @kabakaev for pointing that out. I didn't know that this could be configured. Let's close the ticket then.

@so0k

so0k commented Feb 27, 2018

@kabakaev - wouldn't pod GC cover all pods (including pods terminated for other reasons)? What if we just want evicted pods to be cleaned up periodically?

@serg060606

The issue is still open; there has been no reply to @so0k's comment from Feb 27, 2018.

@shelbaz

shelbaz commented May 13, 2020

@so0k
I created a cron job using a YAML file with this config (see https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/):



apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: delete-failed-pods
spec:
  schedule: "*/30 * * * *"
  failedJobsHistoryLimit: 1
  successfulJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: kubectl-runner
            image: wernight/kubectl
            command: ["sh", "-c", "kubectl get pods --all-namespaces --field-selector 'status.phase==Failed' -o json | kubectl delete -f -"]
          restartPolicy: OnFailure


Create the task with kubectl create -f "PATH_TO_cronjob.yaml"

Check the status of the task with kubectl get cronjob delete-failed-pods

Delete the task with kubectl delete cronjob delete-failed-pods
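
One caveat, as a sketch: in an RBAC-enabled cluster, the kubectl-runner container runs under the job's service account, which needs permission to list and delete pods cluster-wide. Something along these lines should work; the pod-cleaner names and the default namespace are illustrative assumptions, and the CronJob's pod spec would additionally need serviceAccountName: pod-cleaner:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: pod-cleaner       # assumed name
  namespace: default      # assumed namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-cleaner
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: pod-cleaner
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: pod-cleaner
subjects:
- kind: ServiceAccount
  name: pod-cleaner
  namespace: default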

@andyxning
Member

andyxning commented Jun 14, 2021

StatefulSet will auto-delete a Failed pod:

if isFailed(replicas[i]) {
    ssc.recorder.Eventf(set, v1.EventTypeWarning, "RecreatingFailedPod",
        "StatefulSet %s/%s is recreating failed Pod %s",
        set.Namespace,
        set.Name,
        replicas[i].Name)
    if err := ssc.podControl.DeleteStatefulPod(set, replicas[i]); err != nil {
        return &status, err
    }
    // ...
}

For now, Deployment and DaemonSet do not do this.

@gosoon
Contributor

gosoon commented Sep 24, 2021

Why does Kubernetes keep evicted pods, and what is the purpose of this design?

@rehevkor5

One guess for the reason: so that you can look at the failed Pods and see what's happening in the cluster more easily (both in the API and in the metrics). If the Pods immediately disappeared, you'd probably need to use logs to discover what's happening, which is arguably more difficult.
