[signal SIGSEGV: segmentation violation code=0x1 addr=0x78 pc=0x5ff488] #2969
Comments
This looks like a race condition introduced in #2826. I don't understand why we can simply remove the lock for the stopped state. Based on the PR description, I think we can have a finer-grained lock for pid instead of removing the lock outright. Removing the lock introduces race conditions. I mark this p0, because this means that if we exec into a container multiple times, it may panic the containerd-shim, which sounds really, really bad to me. Consider that users usually use exec for liveness probes, but the liveness probe may kill the containerd-shim and eventually kill the container if I remember the |
You are correct that my crashing containers all use health check probes and health check log message is the last one I see before the shim is reaped, thus your multi exec race seems valid. |
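To make the finer-grained locking idea from the comment above concrete, here is a minimal Go sketch. It is not containerd's code: the `process` type, its fields, and the `Start`, `SetExited`, `Exec`, and `Pid` methods are all hypothetical names chosen for illustration. The assumption is that pid reads should not wait behind slow state transitions, while every state check (including the stopped state) still happens under a lock, so concurrent execs cannot race with teardown.

```go
package main

import (
	"fmt"
	"sync"
)

// process is a hypothetical stand-in for a shim-managed process.
// It is NOT containerd's actual type.
type process struct {
	mu    sync.Mutex // guards state transitions (start, exec, exit)
	state string

	pidMu sync.Mutex // finer-grained lock: guards only pid
	pid   int
}

func newProcess() *process { return &process{state: "created"} }

// Pid takes only pidMu, so it never waits behind a slow state transition.
func (p *process) Pid() int {
	p.pidMu.Lock()
	defer p.pidMu.Unlock()
	return p.pid
}

// Start moves created -> running under mu and publishes the pid under pidMu.
func (p *process) Start(pid int) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.state != "created" {
		return fmt.Errorf("cannot start a %s process", p.state)
	}
	p.pidMu.Lock()
	p.pid = pid
	p.pidMu.Unlock()
	p.state = "running"
	return nil
}

// SetExited moves the process to the stopped state under mu.
func (p *process) SetExited() {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.state = "stopped"
}

// Exec checks the state under mu, so a concurrent caller either sees a
// running process or gets an error; it never acts on a half-torn-down one.
func (p *process) Exec() error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.state != "running" {
		return fmt.Errorf("cannot exec in a %s process", p.state)
	}
	return nil
}

func main() {
	p := newProcess()
	_ = p.Start(1234)

	// Simulate several probes exec'ing while the process exits.
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = p.Exec() // may succeed or fail, but must not panic
			_ = p.Pid()
		}()
	}
	p.SetExited()
	wg.Wait()

	fmt.Println(p.Exec())
}
```

In this sketch, an exec that arrives after the process has stopped returns an ordinary error instead of dereferencing state that was torn down concurrently. Running it with `go run -race` is one way to check that the two locks cover every shared field.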
Thanks for your great work, guys. Is there anything I can do to help speed up the official push of release 1.2.3 to Docker? |
Here are our two questions about this bug: |
Description
Running Docker version 18.09.1, build 4c52b90 on Ubuntu 18.04.1 LTS (GNU/Linux 4.15.0-44-generic x86_64). My various containers, all running Swift code based on the official Swift images from https://hub.docker.com/_/swift (latest tag), crash randomly every couple of days. I only receive a "shim reaped" message, and because my Docker container restart policy is set to unless-stopped, the container restarts automatically and runs again for a couple of days.
Looking at all the logs and docker stats, the container memory usage is stable and the process inside the container is not reporting any trouble.
Enabled containerd debug and found the stack trace pasted below. The same stack trace is reported from multiple containers running different Swift projects.
Describe the results you received:
Container randomly restarting without any apparent cause, with only "shim reaped" being reported to the logs.
Describe the results you expected:
Container running smoothly for years :)
I'm not sure if this "shim reaped" is the result of a silent crash of the process inside the container, in which case I'd love to get more diagnostics as to what happened and to which process. Right now I'm not sure if this is a crash of my app or a crash of containerd-shim. Please help me clarify. The stack trace below points to a null pointer dereference in the containerd Go code.
Output of containerd --version: