Closed
Description
Output of helm version:
version.BuildInfo{Version:"v3.2.0-rc.1", GitCommit:"7bffac813db894e06d17bac91d14ea819b5c2310", GitTreeState:"clean", GoVersion:"go1.13.10"}
Output of kubectl version:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.3", GitCommit:"435f92c719f279a3a67808c80521ea17d5715c66", GitTreeState:"clean", BuildDate:"2018-11-26T12:57:14Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-16T08:00:38Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Cloud Provider/Platform (AKS, GKE, Minikube etc.):
kubeadm
Problem description
Rollback failed: Helm can't find Endpoints
We have two versions of a chart, 3.4.0-72 and 3.4.0-73.
- v3.4.0-72: Contains two Endpoints.
$ kubectl -n demo2 get ep
NAME ENDPOINTS AGE
database-pg 192.168.214.210:9187,192.168.214.210:5432 9m25s
database-pg-replica 192.168.200.200:9187,192.168.200.200:5432 10m
- v3.4.0-73: We removed those two Endpoints and let the Service create its endpoints through a label selector.
- The upgrade was successful.
- The rollback failed.
$ ./helm3.2 history demo2 -n demo2
REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
1 Wed Apr 22 12:49:30 2020 superseded database-pg-3.4.0-72 0.0.0-0 Install complete
2 Wed Apr 22 12:51:52 2020 superseded database-pg-3.4.0-73-b7e958c23e 0.0.0-0 Upgrade complete
3 Wed Apr 22 12:54:26 2020 failed database-pg-3.4.0-72 0.0.0-0 Rollback "demo2" failed: no Endpoints with the name "database-pg" found
demo2@node-10:~
- Debug log of the rollback:
$ ./helm3.2 rollback demo2 -n demo2 --debug
rollback.go:60: [debug] preparing rollback of demo2
rollback.go:108: [debug] rolling back demo2 (current: v2, target: v1)
rollback.go:67: [debug] creating rolled back release for demo2
rollback.go:73: [debug] performing rollback of demo2
client.go:258: [debug] Starting delete for "database-pg" Role
client.go:108: [debug] creating 1 resource(s)
client.go:258: [debug] Starting delete for "database-pg" RoleBinding
client.go:108: [debug] creating 1 resource(s)
client.go:258: [debug] Starting delete for "database-pg-hook" ServiceAccount
client.go:108: [debug] creating 1 resource(s)
client.go:258: [debug] Starting delete for "database-pg-hook" Role
client.go:108: [debug] creating 1 resource(s)
client.go:258: [debug] Starting delete for "database-pg-hook" RoleBinding
client.go:108: [debug] creating 1 resource(s)
client.go:258: [debug] Starting delete for "database-pg-hook-cleanup" Job
client.go:287: [debug] jobs.batch "database-pg-hook-cleanup" not found
client.go:108: [debug] creating 1 resource(s)
client.go:467: [debug] Watching for changes to Job database-pg-hook-cleanup with timeout of 5m0s
client.go:495: [debug] Add/Modify event for database-pg-hook-cleanup: ADDED
client.go:534: [debug] database-pg-hook-cleanup: Jobs active: 0, jobs failed: 0, jobs succeeded: 0
client.go:495: [debug] Add/Modify event for database-pg-hook-cleanup: MODIFIED
client.go:534: [debug] database-pg-hook-cleanup: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
client.go:495: [debug] Add/Modify event for database-pg-hook-cleanup: MODIFIED
client.go:258: [debug] Starting delete for "database-pg-hook-cleanup" Job
client.go:173: [debug] checking 8 resources for changes
rollback.go:166: [debug] warning: Rollback "demo2" failed: no Endpoints with the name "database-pg" found
Error: no Endpoints with the name "database-pg" found
helm.go:84: [debug] no Endpoints with the name "database-pg" found
helm.sh/helm/v3/pkg/kube.(*Client).Update.func1
/home/circleci/helm.sh/helm/pkg/kube/client.go:201
helm.sh/helm/v3/pkg/kube.ResourceList.Visit
/home/circleci/helm.sh/helm/pkg/kube/resource.go:32
helm.sh/helm/v3/pkg/kube.(*Client).Update
/home/circleci/helm.sh/helm/pkg/kube/client.go:174
helm.sh/helm/v3/pkg/action.(*Rollback).performRollback
/home/circleci/helm.sh/helm/pkg/action/rollback.go:162
helm.sh/helm/v3/pkg/action.(*Rollback).Run
/home/circleci/helm.sh/helm/pkg/action/rollback.go:74
main.newRollbackCmd.func1
/home/circleci/helm.sh/helm/cmd/helm/rollback.go:59
github.com/spf13/cobra.(*Command).execute
/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:842
github.com/spf13/cobra.(*Command).ExecuteC
/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950
github.com/spf13/cobra.(*Command).Execute
/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887
main.main
/home/circleci/helm.sh/helm/cmd/helm/helm.go:83
runtime.main
/usr/local/go/src/runtime/proc.go:203
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1357
- I can see that the rollback still continued and the Pod went through a rolling update, but the history records an error. This blocks the next upgrade.
$ ./helm3.2 upgrade demo2 --namespace demo2 database-pg-3.4.0-73-b7e958c23e.tgz
Error: UPGRADE FAILED: "demo2" has no deployed releases
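Why the failed rollback blocks the next upgrade can be illustrated with a small sketch. It assumes, based only on the error message and the history shown earlier, that an upgrade needs at least one revision in the `deployed` state. The `Revision` type and `findDeployed` function are hypothetical names for illustration, not Helm's API.

```go
package main

import "fmt"

// Revision is a simplified, hypothetical stand-in for one row of `helm history`.
type Revision struct {
	Number int
	Status string
	Chart  string
}

// findDeployed returns the newest revision whose status is "deployed".
// Assumption: this mirrors the precondition behind the upgrade error.
func findDeployed(name string, history []Revision) (Revision, error) {
	for i := len(history) - 1; i >= 0; i-- {
		if history[i].Status == "deployed" {
			return history[i], nil
		}
	}
	return Revision{}, fmt.Errorf("%q has no deployed releases", name)
}

func main() {
	// Statuses copied from the `helm history demo2` output in this issue.
	history := []Revision{
		{1, "superseded", "database-pg-3.4.0-72"},
		{2, "superseded", "database-pg-3.4.0-73-b7e958c23e"},
		{3, "failed", "database-pg-3.4.0-72"},
	}
	if _, err := findDeployed("demo2", history); err != nil {
		fmt.Println("Error: UPGRADE FAILED:", err)
	}
}
```

With revisions 1 and 2 superseded and revision 3 failed, no revision qualifies, which is consistent with the error reported here.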
Additional info
- I also tested an upgrade from v3.4.0-73 to v3.4.0-72:
$ ./helm3.2 upgrade demo2 --namespace demo2 database-pg-3.4.0-72.tgz
Error: UPGRADE FAILED: rendered manifests contain a resource that already exists. Unable to continue with update: Endpoints "database-pg" in namespace "demo2" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "demo2"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "demo2"
demo2@node-10:~
Expectation
Helm 3 can roll back to v3.4.0-72, which has two Endpoints. No error should be seen.
qingguee commented on Apr 23, 2020
I scanned pkg/kube/client.go.
I think the code above has a problem: it does not consider my case.
During rollback
So it checked the original release, found that the Endpoints did not exist there, and returned an error.
Then why was the new resource not identified at lines 179 and 180? That is supposed to be the branch that handles creation of a new resource and returns.
Because the Endpoints was created by Kubernetes, not by Helm 3.
I am not sure whether my analysis is correct; someone from Helm needs to confirm.
BRs,
qingguee
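The analysis in the comment above can be modelled in a few lines. This is a simplified sketch of the decision order as described there, not Helm's actual code: `Key` and `update` are hypothetical names, and the cluster and release manifests are reduced to sets of kind/name pairs.

```go
package main

import "fmt"

// Key identifies a resource by kind and name (hypothetical, simplified).
type Key struct{ Kind, Name string }

// update walks the target manifest in the order described in the comment:
//  1. resource missing from the cluster       -> create it
//  2. in the cluster but not in the original  -> error
//  3. otherwise                               -> patch it
func update(cluster, original map[Key]bool, target []Key) error {
	for _, k := range target {
		if !cluster[k] {
			cluster[k] = true
			fmt.Printf("created %s %q\n", k.Kind, k.Name)
			continue
		}
		if !original[k] {
			return fmt.Errorf("no %s with the name %q found", k.Kind, k.Name)
		}
		fmt.Printf("patched %s %q\n", k.Kind, k.Name)
	}
	return nil
}

func main() {
	svc := Key{"Service", "database-pg"}
	ep := Key{"Endpoints", "database-pg"}

	// The Endpoints exists in the cluster because Kubernetes generated it
	// for the selector-based Service of revision 2.
	cluster := map[Key]bool{svc: true, ep: true}
	original := map[Key]bool{svc: true} // revision 2: no Endpoints in the chart
	target := []Key{svc, ep}            // revision 1: Endpoints in the chart

	if err := update(cluster, original, target); err != nil {
		fmt.Println("Error:", err)
	}
}
```

In this model the Endpoints is never treated as new, because it already exists in the cluster, and it is not found in the original manifest either, so the second branch produces the error seen in the rollback log.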
bacongobbler commented on Apr 23, 2020
Hi @qingguee, without a chart to test this behaviour, we cannot help you. Please provide a sample along with a set of instructions that we can use to verify the behaviour you are describing. Thanks.
qingguee commented on Apr 24, 2020
Sure.
I will prepare two example charts and detailed reproduction steps to help identify the root cause.
Probably this weekend.
BRs,
qingguee
qingguee commented on Apr 26, 2020
Hi @bacongobbler
I have created two example charts to reproduce this issue, and I also describe our use case.
Get example charts for rollback issue
They can be found at
Reproduce the issue
Diff of the two charts
example-0.1.9:
example-0.2.1:
Use case
Then we hit this rollback issue.
qingguee commented on Apr 27, 2020
Root cause
Endpoints can be created by Kubernetes, not only by Helm.
The code below gets the wrong result when handling Endpoints during rollback: it returns nil for err and then skips the Endpoints creation.
Proposal
github-actions commented on Aug 21, 2020
This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.
bridgetkromhout commented on Sep 4, 2020
Hi, @qingguee! Thank you for your ideas to improve Helm. If you'd like, you can submit a Helm Improvement Proposal so as to work in the community to make your ideas a reality: https://github.com/helm/community/blob/master/hips/hip-0001.md - meanwhile, I will close this issue. Thanks!
sig-abyreddy commented on May 27, 2022
I have faced a similar issue with this piece of code while trying to install the ingress-nginx Helm chart. Here is the debug log:
Helm tries to locate a newly created resource in an earlier release's state, which it shouldn't do. As a result, Helm tries to roll the release back to the stable state.
Helm version: v3.1.3