Can't run: Unable to set Type=notify in systemd service file #7

Closed
KarlCyan opened this issue Nov 20, 2019 · 14 comments

@KarlCyan

Describe the bug

Container logs:

I1120 02:49:13.645198  262971 volume.go:152] Volume manager is running
E1120 02:49:13.645282  262971 server.go:132] Unable to set Type=notify in systemd service file?
I1120 02:49:14.019519  262971 app.go:87] Wait for internal server ready

It then waits for the internal server to become ready, times out after 10 seconds, and exits.
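
For context, the log suggests a startup loop that polls an internal server and gives up after roughly 10 seconds. A hypothetical Go sketch of that pattern (checkReady and the timeout value are placeholders, not gpu-manager's actual code):

package main

import (
	"log"
	"os"
	"time"
)

// checkReady stands in for whatever readiness probe the real server uses.
func checkReady() bool { return false }

func main() {
	deadline := time.Now().Add(10 * time.Second) // the log suggests a ~10s budget
	for time.Now().Before(deadline) {
		if checkReady() {
			log.Println("internal server is ready")
			return
		}
		log.Println("Wait for internal server ready")
		time.Sleep(time.Second)
	}
	log.Println("Wait too long for server ready, restarting")
	os.Exit(1) // the container exits and kubelet restarts it
}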

Environment

OS: CentOS (kernel 3.10.0-957.27.2.el7.x86_64)
Kubernetes: 1.13.2

@KarlCyan
Author

I found that the SdNotify function uses the NOTIFY_SOCKET environment variable to establish its socket connection, but that variable is not set in the container environment.
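
SdNotify here presumably comes from go-systemd, which silently does nothing when NOTIFY_SOCKET is unset. A minimal sketch of that behavior (using github.com/coreos/go-systemd/daemon; the log wording below is an assumption, not gpu-manager's exact code):

package main

import (
	"log"

	"github.com/coreos/go-systemd/daemon"
)

func main() {
	// SdNotify reads the NOTIFY_SOCKET environment variable. Inside a plain
	// container it is usually unset, so sent == false and err == nil.
	sent, err := daemon.SdNotify(false, daemon.SdNotifyReady)
	if err != nil {
		log.Fatalf("failed to notify systemd: %v", err)
	}
	if !sent {
		// This is the situation behind the "Unable to set Type=notify in
		// systemd service file?" message: a warning, not a fatal error.
		log.Println("NOTIFY_SOCKET not set; running without systemd supervision")
	}
}

So the message itself is harmless; it only tells you the process is not running under a systemd Type=notify unit.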

@mYmNeo
Contributor

mYmNeo commented Nov 21, 2019

Unable to set Type=notify in systemd service file?
This error is not the reason why the server is down.

How did you run gpu-manager? Please provide the details.

@KarlCyan
Author

Unable to set Type=notify in systemd service file?
This error is not the reason why the server is down.

How did you run gpu-manager? Please provide the details.

Image name: gpu-manager:v1.0.0
I start it with kubectl create -f gpu-manager.yaml, changing only the namespace and the image name.

@mYmNeo
Contributor

mYmNeo commented Nov 21, 2019

gpu-manager writes its log to the /etc/gpu-manager/log directory on each node. Could you find it and paste it here?

@KarlCyan
Author

gpu-manager writes its log to the /etc/gpu-manager/log directory on each node. Could you find it and paste it here?

I use the --logtostderr parameter and run start.sh:

# ... copy file

# ... mirror file

I1125 06:36:36.331062  477934 volume.go:158] Mirror /usr/local/nvidia/lib64/libnvcuvid.so.430.40 to /etc/gpu-manager/vdriver/origin/lib64
I1125 06:36:36.334182  477934 volume.go:158] Mirror /usr/local/nvidia/lib64/libcuda.so.430.40 to /etc/gpu-manager/vdriver/origin/lib64
I1125 06:36:36.346570  477934 volume.go:167] Driver version: 430.40
I1125 06:36:36.346592  477934 volume.go:158] Mirror /usr/local/nvidia/lib64/libOpenGL.so.0 to /etc/gpu-manager/vdriver/origin/lib64
I1125 06:36:36.347056  477934 volume.go:158] Mirror /usr/local/nvidia/lib64/libGLdispatch.so.0 to /etc/gpu-manager/vdriver/origin/lib64
I1125 06:36:36.347922  477934 volume.go:158] Mirror /usr/local/nvidia/lib64/libGLX_nvidia.so.430.40 to /etc/gpu-manager/vdriver/origin/lib64
I1125 06:36:36.349222  477934 volume.go:158] Mirror /usr/local/nvidia/lib64/libGLX.so.0 to /etc/gpu-manager/vdriver/origin/lib64
I1125 06:36:36.349545  477934 volume.go:158] Mirror /usr/local/nvidia/lib64/libGLESv2_nvidia.so.430.40 to /etc/gpu-manager/vdriver/origin/lib64
I1125 06:36:36.350007  477934 volume.go:158] Mirror /usr/local/nvidia/lib64/libGLESv2.so.2.1.0 to /etc/gpu-manager/vdriver/origin/lib64
I1125 06:36:36.350437  477934 volume.go:158] Mirror /usr/local/nvidia/lib64/libGLESv1_CM_nvidia.so.430.40 to /etc/gpu-manager/vdriver/origin/lib64
I1125 06:36:36.350856  477934 volume.go:158] Mirror /usr/local/nvidia/lib64/libGLESv1_CM.so.1.2.0 to /etc/gpu-manager/vdriver/origin/lib64
I1125 06:36:36.351242  477934 volume.go:158] Mirror /usr/local/nvidia/lib64/libGL.so.1.7.0 to /etc/gpu-manager/vdriver/origin/lib64
I1125 06:36:36.352192  477934 volume.go:158] Mirror /usr/local/nvidia/lib64/libEGL_nvidia.so.430.40 to /etc/gpu-manager/vdriver/origin/lib64
I1125 06:36:36.353594  477934 volume.go:158] Mirror /usr/local/nvidia/lib64/libEGL.so.1.1.0 to /etc/gpu-manager/vdriver/origin/lib64
I1125 06:36:36.354065  477934 volume.go:158] Mirror /usr/local/nvidia/bin/nvidia-cuda-mps-control to /etc/gpu-manager/vdriver/origin/bin
I1125 06:36:36.354429  477934 volume.go:158] Mirror /usr/local/nvidia/bin/nvidia-cuda-mps-server to /etc/gpu-manager/vdriver/origin/bin
I1125 06:36:36.354740  477934 volume.go:158] Mirror /usr/local/nvidia/bin/nvidia-debugdump to /etc/gpu-manager/vdriver/origin/bin
I1125 06:36:36.355194  477934 volume.go:158] Mirror /usr/local/nvidia/bin/nvidia-persistenced to /etc/gpu-manager/vdriver/origin/bin
I1125 06:36:36.355511  477934 volume.go:158] Mirror /usr/local/nvidia/bin/nvidia-smi to /etc/gpu-manager/vdriver/origin/bin
I1125 06:36:36.356749  477934 volume.go:189] Vcuda /usr/lib64/libcuda-control.so to /etc/gpu-manager/vdriver/nvidia/lib/libcuda.so.1
I1125 06:36:36.357310  477934 volume.go:200] Vcuda /usr/lib64/libcuda-control.so to /etc/gpu-manager/vdriver/nvidia/lib/libcuda.so
I1125 06:36:36.357841  477934 volume.go:189] Vcuda /usr/lib64/libcuda-control.so to /etc/gpu-manager/vdriver/nvidia/lib64/libcuda.so.1
I1125 06:36:36.358382  477934 volume.go:200] Vcuda /usr/lib64/libcuda-control.so to /etc/gpu-manager/vdriver/nvidia/lib64/libcuda.so
I1125 06:36:36.358888  477934 volume.go:215] Vcuda /usr/lib64/libcuda-control.so to /etc/gpu-manager/vdriver/nvidia/lib/libnvidia-ml.so.1
I1125 06:36:36.359418  477934 volume.go:226] Vcuda /usr/lib64/libcuda-control.so to /etc/gpu-manager/vdriver/nvidia/lib/libnvidia-ml.so
I1125 06:36:36.359932  477934 volume.go:215] Vcuda /usr/lib64/libcuda-control.so to /etc/gpu-manager/vdriver/nvidia/lib64/libnvidia-ml.so.1
I1125 06:36:36.360469  477934 volume.go:226] Vcuda /usr/lib64/libcuda-control.so to /etc/gpu-manager/vdriver/nvidia/lib64/libnvidia-ml.so
I1125 06:36:36.360490  477934 volume.go:135] Volume manager is running
E1125 06:36:36.360530  477934 server.go:114] Unable to set Type=notify in systemd service file?
I1125 06:36:36.759829  477934 app.go:68] Wait for internal server ready
I1125 06:36:37.760107  477934 app.go:68] Wait for internal server ready
I1125 06:36:38.760520  477934 app.go:68] Wait for internal server ready
I1125 06:36:39.760786  477934 app.go:68] Wait for internal server ready
I1125 06:36:40.762543  477934 app.go:68] Wait for internal server ready
I1125 06:36:41.763197  477934 app.go:68] Wait for internal server ready
I1125 06:36:42.763444  477934 app.go:68] Wait for internal server ready
I1125 06:36:43.763795  477934 app.go:68] Wait for internal server ready
I1125 06:36:44.764208  477934 app.go:68] Wait for internal server ready
W1125 06:36:45.764519  477934 app.go:74] Wait too long for server ready, restarting

@mYmNeo
Contributor

mYmNeo commented Nov 25, 2019

Do you have Unix socket files such as vcore.sock and vmemory.sock in /var/lib/kubelet/device-plugins/? Your log has no line matching the pattern Server %s is ready at %s, which means the plugin servers were not started.
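
A quick way to verify this from the node is to dial those sockets. A small Go sketch (the paths are the defaults mentioned above and may differ on your setup):

package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	// The device-plugin sockets gpu-manager is expected to create.
	sockets := []string{
		"/var/lib/kubelet/device-plugins/vcore.sock",
		"/var/lib/kubelet/device-plugins/vmemory.sock",
	}
	for _, s := range sockets {
		conn, err := net.DialTimeout("unix", s, time.Second)
		if err != nil {
			fmt.Printf("%s: not reachable (%v)\n", s, err)
			continue
		}
		conn.Close()
		fmt.Printf("%s: listening\n", s)
	}
}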

@KarlCyan
Author

Do you have Unix socket files such as vcore.sock and vmemory.sock in /var/lib/kubelet/device-plugins/? Your log has no line matching the pattern Server %s is ready at %s, which means the plugin servers were not started.

I only have kubelet.sock in the container path /var/lib/kubelet/device-plugins/.

@mYmNeo
Contributor

mYmNeo commented Nov 25, 2019

Your log indicated that gpu-manager was stuck at https://github.com/tkestack/gpu-manager/blob/master/pkg/server/server.go#L155

@KarlCyan
Author

KarlCyan commented Nov 26, 2019

Your log indicated that gpu-manager was stuck at https://github.com/tkestack/gpu-manager/blob/master/pkg/server/server.go#L155

I have solved the problem and gpu-manager is now working properly.
gpu-manager was stuck at https://github.com/tkestack/gpu-manager/blob/master/pkg/server/server.go#L141, so I copied the kubeconfig from the master node's /root/.kube/ directory to the container path /root/.kube/.

I think we should add a note about this and update gpu-manager.yaml.
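
For anyone hitting the same hang: it went away once a kubeconfig was available at /root/.kube/ inside the container. A generic client-go sketch of the usual pattern, preferring a mounted kubeconfig and falling back to in-cluster credentials (the common idiom, not gpu-manager's exact code):

package main

import (
	"log"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
)

// buildClient prefers an explicit kubeconfig (e.g. one copied or mounted to
// /root/.kube/config in the container) and falls back to the pod's in-cluster
// service-account credentials when that file is missing.
func buildClient(kubeconfig string) (*kubernetes.Clientset, error) {
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		cfg, err = rest.InClusterConfig()
		if err != nil {
			return nil, err
		}
	}
	return kubernetes.NewForConfig(cfg)
}

func main() {
	if _, err := buildClient("/root/.kube/config"); err != nil {
		log.Fatalf("cannot create kubernetes client: %v", err)
	}
	log.Println("kubernetes client created")
}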

@KarlCyan
Author

The problem has been solved. I will close this issue.

@mYmNeo
Contributor

mYmNeo commented Nov 26, 2019

Thanks for reporting this. I'll update the README about this ASAP

@chenjie222

Hi KarlCyan, I have the same problem as you (see the attached screenshot); can you help me?
My Docker cgroup driver is systemd. I changed it to cgroupfs, but the problem is the same.

@DeepDarkOdyssey

Same issue as @chenjie222 here; any luck finding a solution? I followed this blog, https://cloud.tencent.com/developer/article/1685122, and everything works fine with all pods running, but the gpu-manager daemon pod log shows it is stuck at server.go:132 (see the attached screenshot), just like what @chenjie222 ran into. I tried several things, such as changing the cgroup driver from systemd to cgroupfs, but it didn't work; the pod keeps running with no response.
Meanwhile, I found the extra flags that need to be set in https://github.com/tkestack/gpu-manager/blob/master/docs/faq.md, but when I configured gpu-manager with them (see the attached screenshot), the pod couldn't start. It seems the extra flags are passed directly to gpu-manager as command-line options, and it has no option named "cgroup-driver". What am I missing?

@pandaoknight

After I installed nvidia-container-toolkit, the "Unable to set Type=notify in systemd service file" problem disappeared.
