I am using v1.8.4 and I am having a problem where a deleted namespace stays in the "Terminating" state forever. I already ran "kubectl delete namespace XXXX".
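For anyone else hitting this, a first diagnostic step (not part of the original report, just a common starting point) is to inspect the namespace object itself and see which finalizers are still set and when deletion was requested:

# Show the stuck namespace; spec.finalizers and metadata.deletionTimestamp
# indicate what the namespace controller is still waiting for.
kubectl get namespace XXXX -o yaml

# Narrow it down to just the finalizers list
kubectl get namespace XXXX -o jsonpath='{.spec.finalizers}'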
Sometimes just editing the resource manifest in place (I mean removing the finalizers field and saving) does not work very well, so I picked up another approach from others.
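The approach usually passed around (with the caveat, echoed later in this thread, that clearing finalizers can leave orphaned resources behind) is to clear spec.finalizers through the namespace's finalize subresource rather than editing the object directly; a minimal sketch, assuming jq is available and XXXX is the stuck namespace:

# Dump the namespace, drop its finalizers, and write the result back through
# the /finalize subresource; edits made directly on the object are often reverted.
kubectl get namespace XXXX -o json \
  | jq '.spec.finalizers = []' \
  | kubectl replace --raw "/api/v1/namespaces/XXXX/finalize" -f -

On very old clusters whose kubectl lacks --raw, the equivalent is a PUT of the same JSON to /api/v1/namespaces/XXXX/finalize through kubectl proxy.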
Activity
dims commented on Mar 7, 2018
/sig api-machinery
nikhita commented on Mar 10, 2018
@shean-guangchang Do you have some way to reproduce this?
And out of curiosity, are you using any CRDs? We faced this problem with TPRs previously.
nikhita commented on Mar 10, 2018
/kind bug
oliviabarrick commented on Mar 14, 2018
I seem to be experiencing this issue with a rook deployment:
➜ tmp git:(master) ✗ kubectl delete namespace rook
Error from server (Conflict): Operation cannot be fulfilled on namespaces "rook": The system is ensuring all content is removed from this namespace. Upon completion, this namespace will automatically be purged by the system.
I think it does have something to do with their CRD; I see this in the API server logs:
E0314 07:28:18.284942 1 crd_finalizer.go:275] clusters.rook.io failed with: timed out waiting for the condition
E0314 07:28:18.287629 1 crd_finalizer.go:275] clusters.rook.io failed with: Operation cannot be fulfilled on customresourcedefinitions.apiextensions.k8s.io "clusters.rook.io": the object has been modified; please apply your changes to the latest version and try again
I've deployed rook to a different namespace now, but I'm not able to create the cluster CRD:
➜ tmp git:(master) ✗ cat rook/cluster.yaml
apiVersion: rook.io/v1alpha1
kind: Cluster
metadata:
  name: rook
  namespace: rook-cluster
spec:
  dataDirHostPath: /var/lib/rook-cluster-store
➜ tmp git:(master) ✗ kubectl create -f rook/
Error from server (MethodNotAllowed): error when creating "rook/cluster.yaml": the server does not allow this method on the requested resource (post clusters.rook.io)
Seems like the CRD was never cleaned up:
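The CRD listing itself did not make it into the thread, but a general way to check whether a CRD such as clusters.rook.io is stuck mid-deletion (a sketch added for context, not part of the comment above) is to look at its deletion timestamp and finalizers:

# List the rook CRDs that still exist
kubectl get crd | grep rook.io

# A non-empty metadata.deletionTimestamp together with remaining
# metadata.finalizers means the CRD is waiting on a finalizer to clear.
kubectl get crd clusters.rook.io -o yaml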
oliviabarrick commented on Mar 14, 2018
I have a fission namespace in a similar state:
➜ tmp git:(master) ✗ kubectl delete namespace fission
Error from server (Conflict): Operation cannot be fulfilled on namespaces "fission": The system is ensuring all content is removed from this namespace. Upon completion, this namespace will automatically be purged by the system.
➜ tmp git:(master) ✗ kubectl get pods -n fission
NAME                          READY   STATUS        RESTARTS   AGE
storagesvc-7c5f67d6bd-72jcf   0/1     Terminating   0          8d
➜ tmp git:(master) ✗ kubectl delete pod/storagesvc-7c5f67d6bd-72jcf --force --grace-period=0
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
Error from server (NotFound): pods "storagesvc-7c5f67d6bd-72jcf" not found
➜ tmp git:(master) ✗ kubectl describe pod -n fission storagesvc-7c5f67d6bd-72jcf
Name:                      storagesvc-7c5f67d6bd-72jcf
Namespace:                 fission
Node:                      10.13.37.5/10.13.37.5
Start Time:                Tue, 06 Mar 2018 07:03:06 +0000
Labels:                    pod-template-hash=3719238268
                           svc=storagesvc
Annotations:               <none>
Status:                    Terminating (expires Wed, 14 Mar 2018 06:41:32 +0000)
Termination Grace Period:  30s
IP:                        10.244.2.240
Controlled By:             ReplicaSet/storagesvc-7c5f67d6bd
Containers:
  storagesvc:
    Container ID:   docker://3a1350f6e4871b1ced5c0e890e37087fc72ed2bc8410d60f9e9c26d06a40c457
    Image:          fission/fission-bundle:0.4.1
    Image ID:       docker-pullable://fission/fission-bundle@sha256:235cbcf2a98627cac9b0d0aae6e4ea4aac7b6e6a59d3d77aaaf812eacf9ef253
    Port:           <none>
    Command:
      /fission-bundle
    Args:
      --storageServicePort
      8000
      --filePath
      /fission
    State:          Terminated
      Exit Code:    0
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /fission from fission-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from fission-svc-token-zmsxx (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  fission-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  fission-storage-pvc
    ReadOnly:   false
  fission-svc-token-zmsxx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  fission-svc-token-zmsxx
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
Fission also uses CRDs, however, they appear to be cleaned up.
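Because kubectl get all skips many resource types (CRD-backed ones in particular, as later comments note), a more exhaustive way to see what is still left in a stuck namespace is to query every namespaced resource type; a sketch, assuming a reasonably recent kubectl and the fission namespace from above:

# Enumerate every namespaced, listable resource type, then query each one in
# the stuck namespace; whatever prints is what is still blocking deletion.
kubectl api-resources --verbs=list --namespaced -o name \
  | xargs -n 1 kubectl get --show-kind --ignore-not-found -n fission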
barakAtSoluto commented on Mar 22, 2018
@shean-guangchang - I had the same issue. I deleted everything under the namespaces manually, deleted and purged everything from helm, and restarted the master nodes one by one, and that fixed the issue.
I imagine what I've encountered has something to do with Ark, Tiller, and Kubernetes all working together (I bootstrapped using helm and backed up using Ark), so this may not be a Kubernetes issue per se. On the other hand, it was pretty much impossible to troubleshoot because there are no relevant logs.
xetys commented on Mar 23, 2018
If it is the rook one, take a look at this: rook/rook#1488 (comment)
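The linked rook issue has the authoritative steps; the generic shape of that kind of workaround (an assumption here, not a quote from that issue) is to clear the finalizer that the already-deleted operator can no longer satisfy:

# Only do this once the operator that owned the CRD is really gone;
# a JSON merge patch replaces the finalizers list with an empty one.
kubectl patch crd clusters.rook.io --type merge -p '{"metadata":{"finalizers":[]}}'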
oliviabarrick commented on Mar 23, 2018
I guess that makes sense, but it seems buggy that it's possible to get a namespace into an undeletable state.
OguzPastirmaci commented on Apr 26, 2018
I have a similar environment (Ark & Helm) with @barakAtSoluto and have the same issue. Purging and restarting the masters didn't fix it for me though. Still stuck at terminating.
barakAtSoluto commented on Apr 29, 2018
I had that too when trying to recreate the problem. I eventually had to create a new cluster...
Exclude default, kube-system, kube-public, and all Ark-related namespaces from backup and restore to prevent this from happening...
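As a concrete illustration of that advice (the flag names here are an assumption about the Ark/Velero CLI, so check them against your version before relying on this):

# Skip cluster-internal and Ark-related namespaces when creating a backup
ark backup create nightly --exclude-namespaces default,kube-system,kube-public,heptio-ark

# Same idea on newer releases, where the tool is called Velero
velero backup create nightly --exclude-namespaces default,kube-system,kube-public,velero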
jaxxstorm commented on May 3, 2018
I'm also seeing this on a cluster upgraded from 1.8.4 to 1.9.6. I don't even know which logs to look at.
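For the question of which logs to look at: the namespace controller lives in kube-controller-manager, so its logs plus the API server logs (as in the rook comment above) are the usual place to start; a sketch, assuming a control plane whose components run as pods in kube-system:

# Find the controller-manager pod(s)
kubectl get pods -n kube-system | grep controller-manager

# Follow the namespace controller's messages about the stuck namespace
kubectl logs -n kube-system <kube-controller-manager-pod> | grep -i namespace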
whyvez commented on Jul 15, 2020
In my case, I had to manually delete my ingress load balancer from the GCP Network Services console. I had manually created the load balancer frontend directly in the console. Once I deleted the load balancer, the namespace was automatically deleted.
I suspect Kubernetes didn't want to delete it because the state of the load balancer differed from the state in the manifest.
I will try to automate the ingress frontend creation using annotations next to see if that resolves this issue.
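For the follow-up idea of managing the frontend with annotations, the usual GKE pattern (an assumption here, not something stated in the comment) is to reserve a static IP with gcloud and reference it from the Ingress, so the load balancer frontend is created and torn down by Kubernetes rather than by hand; a minimal sketch with a hypothetical Ingress named "web":

# Reserve a global static IP for the ingress frontend
gcloud compute addresses create web-ip --global

# Point the (hypothetical) Ingress "web" at it via the GKE annotation, so the
# frontend is owned by the cluster instead of being created manually
kubectl annotate ingress web kubernetes.io/ingress.global-static-ip-name=web-ip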
salluu commented on Jul 25, 2020
You are a star, it worked!
Navaneeth-pk commented on Aug 4, 2020
Tried a lot of solutions but this is the one that worked for me. Thank you!
alexcpn commented on Aug 7, 2020
Better: https://stackoverflow.com/a/59667608/429476
matthewoestreich commented on Aug 21, 2020
This should really be the "accepted" answer - it completely resolved the root of this issue!
Taken from the link above:
This is not the right way, especially in a production environment.
Today I got into the same problem. By removing the finalizer you'll end up with leftovers in various states. You should actually find what is keeping the deletion from completing.
(Also, unfortunately, 'kubectl get all' does not report all resources, so you need to use commands like the ones in the link.)
My case: deleting the 'cert-manager' namespace. In the output of 'kubectl get apiservice -o yaml' I found the APIService 'v1beta1.admission.certmanager.k8s.io' with status=False. This apiservice was part of cert-manager, which I had just deleted. So, about 10 seconds after I ran 'kubectl delete apiservice v1beta1.admission.certmanager.k8s.io', the namespace disappeared.
Hope that helps.
With that being said, I wrote a little microservice to run as a CronJob every hour that automatically deletes Terminating namespaces.
You can find it here: https://github.com/oze4/service.remove-terminating-namespaces
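To apply the diagnosis from the quoted answer to any stuck namespace, it is usually enough to look for aggregated APIServices whose backing service is gone, since a single unavailable APIService blocks the namespace controller's content enumeration cluster-wide:

# APIService objects whose availability condition is False point at the culprit
kubectl get apiservice | grep False

# If the component that registered it (cert-manager here) is already gone,
# deleting the stale APIService unblocks namespace deletion.
kubectl delete apiservice v1beta1.admission.certmanager.k8s.io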