Closed
Description
/kind feature
What happened:
Kubelet has evicted pods due to disk pressure. Eventually, the disk pressure went away and the pods were scheduled and started again, but the evicted pods remained in the list of pods (`kubectl get pod --show-all`).
What you expected to happen:
Wouldn't it be better if the kubelet deleted those evicted pods? The expected behaviour would therefore be to no longer see the evicted pods, i.e. that they get deleted.
How to reproduce it (as minimally and precisely as possible):
Start kubelet with `--eviction-hard` and `--eviction-soft` set to high thresholds, or fill up the disk of a worker node.
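As an illustration (the signal names and values below are examples for reproduction, not taken from the reporter's setup), eviction flags that trigger easily could look like:

```shell
# Sketch: kubelet eviction thresholds set so high that they fire while
# most of the disk is still free, which makes the eviction easy to
# reproduce. Values are illustrative only.
kubelet \
  --eviction-hard='nodefs.available<90%' \
  --eviction-soft='nodefs.available<95%' \
  --eviction-soft-grace-period='nodefs.available=30s'
```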
Environment:
- Kubernetes version (use `kubectl version`): 1.8.2
- Cloud provider or hardware configuration: AWS
- OS (e.g. from /etc/os-release): Container Linux 1465.7.0 (Ladybug)
- Kernel (e.g. `uname -a`): 4.12.10-coreos
Activity
rfranzke commentedon Nov 3, 2017
/sig node
ghost commentedon Nov 3, 2017
Evicted pods in 1.7.5 are madness; the deletion of those pods is delayed by days! For example, I have a pod that was evicted 17 days ago and still appears in the pod list:
mynamespace   nginxrepo-2549817493-0p91t           0/1   Evicted   0   17d
mynamespace   linting-umbrellabird-monocular-api   0/1   Evicted   0   17d
In the case of nginxrepo, the deployment does not exist anymore, but the pods are still present in the list of pods as Evicted!
Pods that do not match the node-selector criteria should also be deleted:
nfs-3970785943-5rnn7   0/2   MatchNodeSelector   0   17d
After 17 days the pod still appears in the list! This behaviour affects, for example, Grafana, because the evicted pods appear in the list of available pods for monitoring.
By the way @rfranzke, this is not a feature request, this is an issue! Please, could you re-tag the case?
Regards
rfranzke commentedon Nov 3, 2017
/kind bug
rfranzke commentedon Nov 3, 2017
/remove-kind feature
ghost commentedon Nov 3, 2017
Thank you @rfranzke
liggitt commentedon Nov 3, 2017
Is this a duplicate of #54525?
From #54525 (comment) it sounds like this is intentional, though I'm not sure what is expected to clean up pods in this case.
yujuhong commentedon Nov 3, 2017
The PodGCController in the controller manager?
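For reference (a sketch based on the kube-controller-manager flags, not something stated in this thread): the PodGCController only deletes terminated pods once their total count exceeds `--terminated-pod-gc-threshold`, which defaults to 12500, so on small clusters evicted pods are effectively never garbage-collected. Lowering the threshold makes the cleanup kick in sooner:

```shell
# Sketch: lower the terminated-pod GC threshold so the PodGCController
# cleans up evicted/terminated pods sooner. How the flag is passed
# depends on how the control plane runs kube-controller-manager (static
# pod manifest, systemd unit, etc.); the value 100 is an example, not a
# recommendation.
kube-controller-manager \
  --terminated-pod-gc-threshold=100
```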
krallistic commentedon Nov 7, 2017
A quick workaround we use, is to delete all evicted pods manually after an incident:
kubectl get pods --all-namespaces -ojson | jq -r '.items[] | select(.status.reason!=null) | select(.status.reason | contains("Evicted")) | .metadata.name + " " + .metadata.namespace' | xargs -n2 -l bash -c 'kubectl delete pods $0 --namespace=$1'
Not as nice as automatic deletion, but it works. (Tested with 1.6.7; I heard that in 1.7 you need to add `--show-all`.)
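To illustrate what the jq filter in the one-liner above selects (the JSON below is a hand-made sample, not real cluster output), here it is applied to a stripped-down version of the `kubectl get pods -o json` structure:

```shell
# Hypothetical sample of `kubectl get pods --all-namespaces -o json`,
# reduced to the fields the jq filter inspects: one evicted pod and one
# healthy pod with no status.reason set.
sample='{
  "items": [
    {"metadata": {"name": "nginxrepo-2549817493-0p91t", "namespace": "mynamespace"},
     "status": {"reason": "Evicted"}},
    {"metadata": {"name": "healthy-pod", "namespace": "default"},
     "status": {}}
  ]
}'

# Same filter as the workaround: keep only pods whose status.reason
# contains "Evicted", then print "<name> <namespace>" pairs, which the
# one-liner feeds to `kubectl delete pods` via xargs.
evicted=$(echo "$sample" | jq -r '.items[]
  | select(.status.reason != null)
  | select(.status.reason | contains("Evicted"))
  | .metadata.name + " " + .metadata.namespace')

echo "$evicted"
```

Only the evicted pod's name/namespace pair survives the filter; the healthy pod is dropped because its `status.reason` is null.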
ghost commentedon Nov 7, 2017
Thank you @krallistic, I applied your workaround as a cronjob a long time ago, but it is not the right way!
@liggitt it is not a duplicate; that case is about deletion timestamps not being set properly for different controllers (DaemonSet, StatefulSet, etc.). Here it appears that none of the deployments get a deletion timestamp at all, and this affects the monitoring of the Kubernetes environment!
fejta-bot commentedon Feb 7, 2018
Issues go stale after 90d of inactivity.
Mark the issue as fresh with `/remove-lifecycle stale`.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with `/close`.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale