
StatefulSet - can't rollback from a broken state #67250

Open
MrTrustor opened this issue Aug 10, 2018 · 64 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug.
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.
sig/apps Categorizes an issue or PR as relevant to SIG Apps.
sig/architecture Categorizes an issue or PR as relevant to SIG Architecture.
sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling.

Comments

@MrTrustor

MrTrustor commented Aug 10, 2018

/kind bug

What happened:

I updated a StatefulSet with a non-existent Docker image. As expected, a pod of the StatefulSet is destroyed and can't be recreated (ErrImagePull). However, when I change the StatefulSet back to an existing image, it doesn't remove the broken pod and replace it with a good one; it keeps trying to pull the non-existent image.
You have to delete the broken pod manually to unblock the situation.

Related Stackoverflow question

What you expected to happen:

When rolling back the bad config, I expected the StatefulSet to remove the broken pod and replace it with a good one.

How to reproduce it (as minimally and precisely as possible):

  1. Deploy the following StatefulSet:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 3 # by default is 1
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "standard"
      resources:
        requests:
          storage: 10Gi
  2. Once the 3 pods are running, update the StatefulSet spec and change the image to k8s.gcr.io/nginx-slim:foobar.
  3. Observe the new pod failing to pull the image.
  4. Roll back the change.
  5. Observe the broken pod not being deleted (the manual workaround of deleting it yourself is sketched below).
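For reference, the manual unblock step from the description above as a minimal client-go sketch: delete the stuck pod so the StatefulSet controller recreates it from the rolled-back revision. This assumes a recent client-go; the namespace "default" and pod name "web-2" are placeholders, and the same thing can be done with a plain kubectl delete pod.

package main

import (
	"context"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig (placeholder setup).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}
	// Deleting the stuck pod is what unblocks the rollout today: once it is
	// gone, the StatefulSet controller recreates it from the current
	// (rolled-back) revision.
	if err := client.CoreV1().Pods("default").Delete(context.TODO(), "web-2", metav1.DeleteOptions{}); err != nil {
		log.Fatal(err)
	}
	log.Println("deleted stuck pod web-2; the StatefulSet controller will recreate it")
}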

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.7", GitCommit:"dd5e1a2978fd0b97d9b78e1564398aeea7e7fe92", GitTreeState:"clean", BuildDate:"2018-04-19T00:05:56Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10+", GitVersion:"v1.10.5-gke.3", GitCommit:"6265b9797fc8680c8395abeab12c1e3bad14069a", GitTreeState:"clean", BuildDate:"2018-07-19T23:02:51Z", GoVersion:"go1.9.3b4", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: Google Kubernetes Engine
  • OS (e.g. from /etc/os-release): COS

cc @joe-boyce

@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. kind/bug Categorizes issue or PR as related to a bug. labels Aug 10, 2018
@MrTrustor
Author

MrTrustor commented Aug 10, 2018

/sig apps
/sig scheduling

@k8s-ci-robot k8s-ci-robot added sig/apps Categorizes an issue or PR as relevant to SIG Apps. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 10, 2018
@joe-boyce

Anybody have any ideas on this one?

@k8s-ci-robot k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Aug 21, 2018
@enisoc
Member

enisoc commented Aug 21, 2018

As far as I can tell, StatefulSet doesn't make any attempt to support this use case, namely using a rolling update to fix a StatefulSet that's in a broken state. If any of the existing Pods are broken, it appears that StatefulSet bails out before even reaching the rolling update code:

	if !isRunningAndReady(replicas[i]) && monotonic {
		glog.V(4).Infof(
			"StatefulSet %s/%s is waiting for Pod %s to be Running and Ready",
			set.Namespace,
			set.Name,
			replicas[i].Name)
		return &status, nil
	}

I haven't found any mention of this limitation in the docs, but it's possible that it was a choice made intentionally to err on the side of caution (stop and make the human decide) since stateful data is at stake and stateful Pods often have dependencies on each other (e.g. they may form a cluster/quorum).

With that said, I agree it would be ideal if StatefulSet supported this, at least for clear cases like this one where deleting a Pod that's stuck Pending is unlikely to cause any additional damage.
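To make those "clear cases" concrete, here is an illustrative sketch, not actual StatefulSet controller code (the function name and the exact conditions are my own), of the kind of check that could identify a pod that is safe to delete: it never became Ready, it is stuck waiting on an image pull, and its controller-revision-hash label no longer matches the StatefulSet's update revision.

package sketch

import v1 "k8s.io/api/core/v1"

// isStuckOnOldRevision is illustrative only: it flags a pod that is still
// Pending, is waiting on an image-pull error, and carries a
// controller-revision-hash label that differs from the StatefulSet's
// updateRevision, i.e. the situation described in this issue.
func isStuckOnOldRevision(pod *v1.Pod, updateRevision string) bool {
	if pod.Status.Phase != v1.PodPending {
		return false // only consider pods that never became Running and Ready
	}
	if pod.Labels["controller-revision-hash"] == updateRevision {
		return false // already on the desired revision, nothing to fix
	}
	for _, cs := range pod.Status.ContainerStatuses {
		if w := cs.State.Waiting; w != nil &&
			(w.Reason == "ErrImagePull" || w.Reason == "ImagePullBackOff") {
			return true // deleting this pod should unblock the rollout
		}
	}
	return false
}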

cc @kow3ns

@mattmb

mattmb commented Sep 7, 2018

+1, I just discovered this and had assumed that it would work more like the Deployment controller.

In https://github.com/yelp/paasta we are programmatically creating/updating Deployments and StatefulSets. For that to make sense I really want them to always attempt to converge to the definition.
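For a system that drives StatefulSets programmatically, one way to notice that convergence has stalled (and that intervention such as deleting the stuck pod is needed) is to compare the rollout status fields. A sketch assuming apps/v1; the rolloutComplete helper and the idea of pairing it with a deadline are mine, not paasta's.

package sketch

import appsv1 "k8s.io/api/apps/v1"

// rolloutComplete reports whether a StatefulSet has fully converged on its
// updated revision. A caller would typically treat "still false after some
// deadline" as a stuck rollout that needs intervention.
func rolloutComplete(sts *appsv1.StatefulSet) bool {
	if sts.Status.ObservedGeneration < sts.Generation {
		return false // controller has not observed the latest spec yet
	}
	desired := int32(1)
	if sts.Spec.Replicas != nil {
		desired = *sts.Spec.Replicas
	}
	return sts.Status.UpdateRevision == sts.Status.CurrentRevision &&
		sts.Status.UpdatedReplicas == desired &&
		sts.Status.ReadyReplicas == desired
}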

bonifaido added a commit to bank-vaults/bank-vaults that referenced this issue Sep 13, 2018
This option gives us a way to work around current StatefulSet limitations around updates.
See: kubernetes/kubernetes#67250
By default it is false.
matyix pushed a commit to bank-vaults/bank-vaults that referenced this issue Sep 13, 2018
This option gives us a way to work around current StatefulSet limitations around updates.
See: kubernetes/kubernetes#67250
By default it is false.
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 6, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 5, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mattmb

mattmb commented Feb 7, 2019

/reopen

@k8s-ci-robot
Contributor

@mattmb: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mattmb

mattmb commented Feb 7, 2019

Heh, well it was worth a go I suppose...

@MrTrustor
Author

/reopen

@k8s-ci-robot
Contributor

@MrTrustor: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Feb 8, 2019
@huzhengchuan
Contributor

+1, I hit the same problem.
I think the rollback should succeed, but right now it is blocked.

@dave-tock

+1. This is a pretty big landmine in using StatefulSet: if you ever make any mistake, you're stuck with destroying your StatefulSet and starting over. IOW, if you ever make a mistake with a StatefulSet, you need to cause an outage to recover :(

@krmayankk

/remove-lifecycle rotten

f41gh7 added a commit to VictoriaMetrics/operator that referenced this issue Dec 25, 2021
With rollingUpdateStrategy: RollingUpdate, the operator doesn't perform the statefulset update itself;
it delegates it to the kubernetes statefulset controller.

The default strategy is OnDelete, where the rolling update process is controlled by the operator.

The RollingUpdate strategy may be useful when the user wants to run kubectl rollout restart,
or for some reason doesn't trust the operator's rolling update process.

Also fixes annotation merging for pod templates, which makes kubectl rollout restart possible for other services.

Related kubernetes issue about the statefulset broken state and why a manual rolling update may be needed:
 kubernetes/kubernetes#67250

#389
@saumeya

saumeya commented Apr 4, 2022

This issue also affects our operator - https://github.com/redhat-developer/gitops-operator. Any updates on this problem?

@yifan-gu-anchorage

Hitting the same issue in k8s version 1.21.6

@a-hilaly
Member

/assign

@a-hilaly
Member

/unassign

@tomasz-dudziak

+1

@kerthcet
Member

/assign
I'll take a look, as this seems to be highly needed by the community.

@kerthcet
Member

FYI: KEP initialized here kubernetes/enhancements#3562, glad to hear any advice.

sf-project-io pushed a commit to softwarefactory-project/sf-operator that referenced this issue Aug 10, 2023
This change ensures that the generate-tenant-config script, when run via the zuul-scheduler pod's init-container, won't fail when the config location is inaccurate.

Having the init-container succeed even when the main.yaml file cannot be fully computed ensures that the zuul-scheduler pod reaches the running state.

Indeed, without that change it is impossible (except manually) to recover from a mistake in the config location setting. Three components must restart when the location changes: git-server, nodepool, zuul-scheduler. While component restarts work well for git-server and nodepool, they are broken for zuul-scheduler.

According to https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#forced-rollback, a rollout will only happen when the Pod is Ready, which is not the case when the zuul-scheduler pod fails to start because of the init-container failure.

More insights here: kubernetes/kubernetes#67250

The approach implemented in this patch is quite simple: it just bypasses the init-container failures and ensures the pod starts. The idea is to make it easy to recover from a wrong config location setting.

Change-Id: I6885605d30b5ffc8c4c327dc4590b56316bd7955
andreasgerstmayr added a commit to andreasgerstmayr/tempo-operator that referenced this issue Sep 11, 2023
Changes to a StatefulSet are not propagated to pods in a broken state (e.g. CrashLoopBackOff)
See kubernetes/kubernetes#67250

This is a workaround for the above issue.

Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>
andreasgerstmayr added a commit to grafana/tempo-operator that referenced this issue Sep 19, 2023
Changes to a StatefulSet are not propagated to pods in a broken state (e.g. CrashLoopBackOff)
See kubernetes/kubernetes#67250

This is a workaround for the above issue.

Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>
@wingyiu

wingyiu commented Mar 26, 2024

/reopen

Projects
Status: Needs Triage