'ERROR: for snuba-api Container "ee666f7f2cdd" is unhealthy.' during install.sh #1178

@ynotna87

Description

Version

21.11.0

Steps to Reproduce

  1. Run ./install.sh

Expected Result

Installation succeeds.

Actual Result

▶ Bootstrapping and migrating Snuba ...
Creating sentry_onpremise_clickhouse_1 ...
Creating sentry_onpremise_zookeeper_1 ...
Creating sentry_onpremise_redis_1 ...
Creating sentry_onpremise_redis_1 ... done
Creating sentry_onpremise_zookeeper_1 ... done
Creating sentry_onpremise_clickhouse_1 ... done
Creating sentry_onpremise_kafka_1 ...
Creating sentry_onpremise_kafka_1 ... done

ERROR: for snuba-api Container "ee666f7f2cdd" is unhealthy.
Encountered errors while bringing up the project.
An error occurred, caught SIGERR on line 3
Cleaning up...

Can anyone help me with this?

Activity

chadwhitacre (Member) commented on Nov 30, 2021

Is this a clean install @ynotna87? If not, can you try a clean install?

ynotna87 (Author) commented on Dec 1, 2021

Is this a clean install @ynotna87? If not, can you try a clean install?

Hi Chad, yes it is a clean install from scratch on a new VM.

OS: Centos 7
Docker Version: 20.10.11
docker-compose Version: 1.29.2
Python Version: 3.6.8
vCPU: 8
RAM: 16G

aminvakil (Collaborator) commented on Dec 1, 2021

Try running docker-compose down -v --remove-orphans && docker volume prune -f && docker-compose up -d again.

Beware: this will effectively remove all your data.

ynotna87 (Author) commented on Dec 2, 2021

Try running docker-compose down -v --remove-orphans && docker volume prune -f && docker-compose up -d again.
Beware: this will effectively remove all your data.

Hi Amin,

I tried running the command, but unfortunately I'm still getting the same error. What could possibly be wrong? I'm currently requesting an additional VM to try installing on, but it has yet to be provisioned.

Update: after trying on another VM, it still gives the same error:

▶ Bootstrapping and migrating Snuba ...
Creating sentry_onpremise_clickhouse_1 ...
Creating sentry_onpremise_zookeeper_1 ...
Creating sentry_onpremise_redis_1 ...
Creating sentry_onpremise_redis_1 ... done
Creating sentry_onpremise_clickhouse_1 ... done
Creating sentry_onpremise_zookeeper_1 ... done
Creating sentry_onpremise_kafka_1 ...
Creating sentry_onpremise_kafka_1 ... done

ERROR: for snuba-api Container "05c4e3a60327" is unhealthy.
Encountered errors while bringing up the project.
An error occurred, caught SIGERR on line 3
Cleaning up...

chadwhitacre (Member) commented on Dec 2, 2021

Can you run with debugging and paste your full install log in a gist?

DEBUG=1 ./install.sh --no-user-prompt

AxTheB commented on Dec 3, 2021

Had the same issue, rebooting the machine helped.

chadwhitacre (Member) commented on Dec 3, 2021

What is the snuba-api healthcheck? How is it failing? Why?

snuba-api:
  <<: *depends_on-default

Curious to me that snuba-api doesn't seem on the surface to have a healthcheck. 🤔
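(For anyone debugging this: with docker-compose, an "is unhealthy" error during `up` generally points at a dependency declared with `condition: service_healthy`, so the quoted container ID belongs to the failing dependency rather than to snuba-api itself. A sketch of how to map the ID from the error message to a name and see why its healthcheck is failing; `ee666f7f2cdd` is the ID from the install log above:)

```shell
# Map the container ID quoted in the error to a container name and health status.
docker inspect --format '{{ .Name }} -> {{ .State.Health.Status }}' ee666f7f2cdd

# Dump the output of the recent healthcheck probes for that container.
docker inspect --format '{{ range .State.Health.Log }}{{ .Output }}{{ end }}' ee666f7f2cdd
```

This requires the container to still exist, so run it before install.sh's cleanup removes it.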

aminvakil (Collaborator) commented on Dec 4, 2021

Curious to me that snuba-api doesn't seem on the surface to have a healthcheck. 🤔

It does not.

docker ps | grep snuba-api
bab3ff609285   getsentry/snuba:21.9.0                 "./docker_entrypoint…"   2 hours ago   Up 2 hours             1218/tcp                                    sentry_onpremise_snuba-api_1

(It does not have a (healthy) after Up 2 hours.)
I'm more confused now: why does it have one on the first run? 🤔

markdensen403 commented on Dec 5, 2021

I'm also having the same error where snuba-api is failing. In fact, my entire Docker stack is down and production is offline, because I was trying to update our Sentry server. Does anyone have any idea what is happening here?

AxTheB commented on Dec 5, 2021

At that point it will fail if, for example, any of the started containers does not come up. Try starting the zookeeper, clickhouse, redis, and kafka containers manually and check their state.
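(A sketch of that manual check, assuming the service names match the container names in the install log above:)

```shell
# Bring up only the dependency services and inspect their state.
docker-compose up -d zookeeper clickhouse redis kafka
docker-compose ps

# If any of them is restarting or unhealthy, its logs usually say why.
docker-compose logs --tail=50 clickhouse kafka
```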

(43 hidden comments not shown)

shaqaruden commented on Feb 2, 2022

I had this same issue. After a failed install I ran docker-compose down, which brought down all the new containers, but I needed to run docker-compose down again to bring down the older containers. I ran the same command once more to ensure everything was down, and then ./install.sh succeeded:

...
ERROR: for snuba-api  Container "faf1ca068692" is unhealthy.
Encountered errors while bringing up the project.
An error occurred, caught SIGERR on line 3
Cleaning up...

[redacted] in sentry at [redacted] on  tags/22.1.0 [!?] on 🐳 v20.10.12 took 57s 
➜ dcd
Removing sentry-onpremise_kafka_1      ... done
Removing sentry-onpremise_clickhouse_1 ... done
Removing sentry-onpremise_zookeeper_1  ... done
Removing sentry-onpremise_redis_1      ... done
Removing network sentry-onpremise_default

[redacted] in sentry at [redacted] on  tags/22.1.0 [!?] on 🐳 v20.10.12 
➜ vim .env

[redacted] in sentry at [redacted] on  tags/22.1.0 [!?] on 🐳 v20.10.12 took 8s 
➜ dcd
Stopping sentry_onpremise_nginx_1                                    ... done
Stopping sentry_onpremise_relay_1                                    ... done
Stopping sentry_onpremise_worker_1                                   ... done
Stopping sentry_onpremise_cron_1                                     ... done
Stopping sentry_onpremise_subscription-consumer-events_1             ... done
Stopping sentry_onpremise_post-process-forwarder_1                   ... done
Stopping sentry_onpremise_sentry-cleanup_1                           ... done
Stopping sentry_onpremise_subscription-consumer-transactions_1       ... done
Stopping sentry_onpremise_ingest-consumer_1                          ... done
Stopping sentry_onpremise_web_1                                      ... done
Stopping sentry_onpremise_snuba-cleanup_1                            ... done
Stopping sentry_onpremise_snuba-transactions-cleanup_1               ... done
Stopping sentry_onpremise_symbolicator-cleanup_1                     ... done
Stopping sentry_onpremise_snuba-replacer_1                           ... done
Stopping sentry_onpremise_snuba-subscription-consumer-transactions_1 ... done
Stopping sentry_onpremise_snuba-sessions-consumer_1                  ... done
Stopping sentry_onpremise_snuba-outcomes-consumer_1                  ... done
Stopping sentry_onpremise_snuba-subscription-consumer-events_1       ... done
Stopping sentry_onpremise_snuba-consumer_1                           ... done
Stopping sentry_onpremise_snuba-api_1                                ... done
Stopping sentry_onpremise_snuba-transactions-consumer_1              ... done
Stopping sentry_onpremise_postgres_1                                 ... done
Stopping sentry_onpremise_smtp_1                                     ... done
Stopping sentry_onpremise_memcached_1                                ... done
Stopping sentry_onpremise_symbolicator_1                             ... done
Stopping sentry_onpremise_kafka_1                                    ... done
Stopping sentry_onpremise_clickhouse_1                               ... done
Stopping sentry_onpremise_zookeeper_1                                ... done
Stopping sentry_onpremise_redis_1                                    ... done
Removing sentry_onpremise_nginx_1                                    ... done
Removing sentry_onpremise_relay_1                                    ... done
Removing sentry_onpremise_worker_1                                   ... done
Removing sentry_onpremise_cron_1                                     ... done
Removing sentry_onpremise_subscription-consumer-events_1             ... done
Removing sentry_onpremise_post-process-forwarder_1                   ... done
Removing sentry_onpremise_sentry-cleanup_1                           ... done
Removing sentry_onpremise_subscription-consumer-transactions_1       ... done
Removing sentry_onpremise_ingest-consumer_1                          ... done
Removing sentry_onpremise_web_1                                      ... done
Removing sentry_onpremise_snuba-cleanup_1                            ... done
Removing sentry_onpremise_snuba-transactions-cleanup_1               ... done
Removing sentry_onpremise_symbolicator-cleanup_1                     ... done
Removing sentry_onpremise_geoipupdate_1                              ... done
Removing sentry_onpremise_snuba-replacer_1                           ... done
Removing sentry_onpremise_snuba-subscription-consumer-transactions_1 ... done
Removing sentry_onpremise_snuba-sessions-consumer_1                  ... done
Removing sentry_onpremise_snuba-outcomes-consumer_1                  ... done
Removing sentry_onpremise_snuba-subscription-consumer-events_1       ... done
Removing sentry_onpremise_snuba-consumer_1                           ... done
Removing sentry_onpremise_snuba-api_1                                ... done
Removing sentry_onpremise_snuba-transactions-consumer_1              ... done
Removing sentry_onpremise_postgres_1                                 ... done
Removing sentry_onpremise_smtp_1                                     ... done
Removing sentry_onpremise_memcached_1                                ... done
Removing sentry_onpremise_symbolicator_1                             ... done
Removing sentry_onpremise_kafka_1                                    ... done
Removing sentry_onpremise_clickhouse_1                               ... done
Removing sentry_onpremise_zookeeper_1                                ... done
Removing sentry_onpremise_redis_1                                    ... done
Removing network sentry_onpremise_default

[redacted] in sentry at [redacted] on  tags/22.1.0 [?] on 🐳 v20.10.12 took 26s 
➜ dcd
Removing network sentry-self-hosted_default
WARNING: Network sentry-self-hosted_default not found.

[redacted] in sentry at [redacted] on  tags/22.1.0 [?] on 🐳 v20.10.12 
➜ ./install.sh 
▶ Parsing command line ...

▶ Initializing Docker Compose ...

▶ Setting up error handling ...

...

-----------------------------------------------------------------

You're all done! Run the following command to get Sentry running:

  docker-compose up -d

-----------------------------------------------------------------

github-actions commented on Feb 24, 2022

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog or Status: In Progress, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

ragesoss commented on Mar 1, 2022

I ran into this when upgrading today, and had to use docker stop $(docker ps -q) to get the upgrade to work properly. (docker compose down was not sufficient.)

aminvakil (Collaborator) commented on Mar 2, 2022

I ran into this when upgrading today, and had to use docker stop $(docker ps -q) to get the upgrade to work properly. (docker compose down was not sufficient.)

@ragesoss If this happened going from a version < 21.12.0 to a version >= 21.12.0, docker-compose down -v --remove-orphans should have worked, or executing docker-compose down -v before checking out the new version.

iburrows commented on Mar 15, 2022

I recently upgraded from 20.12.1 -> 21.6.3 -> 22.2.0, and running docker-compose down -v --remove-orphans did not help going from 21.6.3 -> 22.2.0. The only way I could get it running was to set the clickhouse healthcheck to test: "exit 0", and then it started up. I looked at #1081, as it described the same error I was seeing, but there is nothing different between master and the latest release (currently 22.2.0). I probably broke something by setting test: "exit 0", but it started up. Here are the logs from clickhouse:

$ docker logs 2047e3808568
Processing configuration file '/etc/clickhouse-server/config.xml'.
Merging configuration file '/etc/clickhouse-server/config.d/docker_related_config.xml'.
Merging configuration file '/etc/clickhouse-server/config.d/sentry.xml'.
Include not found: clickhouse_remote_servers
Include not found: clickhouse_compression
Logging information to /var/log/clickhouse-server/clickhouse-server.log
Logging errors to /var/log/clickhouse-server/clickhouse-server.err.log
Logging information to console
2022.03.15 14:32:13.546698 [ 1 ] {} <Information> : Starting ClickHouse 20.3.9.70 with revision 54433
2022.03.15 14:32:13.549431 [ 1 ] {} <Information> Application: starting up
Include not found: networks
2022.03.15 14:32:13.565627 [ 1 ] {} <Information> Application: Uncompressed cache size was lowered to 1.90 GiB because the system has low amount of memory
2022.03.15 14:32:13.565955 [ 1 ] {} <Information> Application: Mark cache size was lowered to 1.90 GiB because the system has low amount of memory
2022.03.15 14:32:13.565998 [ 1 ] {} <Information> Application: Loading metadata from /var/lib/clickhouse/
2022.03.15 14:32:13.567895 [ 1 ] {} <Information> DatabaseOrdinary (system): Total 2 tables and 0 dictionaries.
2022.03.15 14:32:13.571923 [ 44 ] {} <Information> BackgroundProcessingPool: Create BackgroundProcessingPool with 16 threads
2022.03.15 14:32:13.832585 [ 1 ] {} <Information> DatabaseOrdinary (system): Starting up tables.
2022.03.15 14:32:13.843690 [ 1 ] {} <Information> DatabaseOrdinary (default): Total 13 tables and 0 dictionaries.
2022.03.15 14:32:13.944329 [ 1 ] {} <Information> DatabaseOrdinary (default): Starting up tables.
2022.03.15 14:32:13.947507 [ 1 ] {} <Information> BackgroundSchedulePool: Create BackgroundSchedulePool with 16 threads
2022.03.15 14:32:13.948176 [ 1 ] {} <Information> Application: It looks like the process has no CAP_NET_ADMIN capability, 'taskstats' performance statistics will be disabled. It could happen due to incorrect ClickHouse package installation. You could resolve the problem manually with 'sudo setcap cap_net_admin=+ep /usr/bin/clickhouse'. Note that it will not work on 'nosuid' mounted filesystems. It also doesn't work if you run clickhouse-server inside network namespace as it happens in some containers.
2022.03.15 14:32:13.948212 [ 1 ] {} <Information> Application: It looks like the process has no CAP_SYS_NICE capability, the setting 'os_thread_nice' will have no effect. It could happen due to incorrect ClickHouse package installation. You could resolve the problem manually with 'sudo setcap cap_sys_nice=+ep /usr/bin/clickhouse'. Note that it will not work on 'nosuid' mounted filesystems.
2022.03.15 14:32:13.950341 [ 1 ] {} <Error> Application: Listen [::]:8123 failed: Poco::Exception. Code: 1000, e.code() = 0, e.displayText() = DNS error: EAI: -9 (version 20.3.9.70 (official build)). If it is an IPv6 or IPv4 address and your host has disabled IPv6 or IPv4, then consider to specify not disabled IPv4 or IPv6 address to listen in <listen_host> element of configuration file. Example for disabled IPv6: <listen_host>0.0.0.0</listen_host> . Example for disabled IPv4: <listen_host>::</listen_host>
2022.03.15 14:32:13.950672 [ 1 ] {} <Error> Application: Listen [::]:9000 failed: Poco::Exception. Code: 1000, e.code() = 0, e.displayText() = DNS error: EAI: -9 (version 20.3.9.70 (official build)). If it is an IPv6 or IPv4 address and your host has disabled IPv6 or IPv4, then consider to specify not disabled IPv4 or IPv6 address to listen in <listen_host> element of configuration file. Example for disabled IPv6: <listen_host>0.0.0.0</listen_host> . Example for disabled IPv4: <listen_host>::</listen_host>
2022.03.15 14:32:13.950951 [ 1 ] {} <Error> Application: Listen [::]:9009 failed: Poco::Exception. Code: 1000, e.code() = 0, e.displayText() = DNS error: EAI: -9 (version 20.3.9.70 (official build)). If it is an IPv6 or IPv4 address and your host has disabled IPv6 or IPv4, then consider to specify not disabled IPv4 or IPv6 address to listen in <listen_host> element of configuration file. Example for disabled IPv6: <listen_host>0.0.0.0</listen_host> . Example for disabled IPv4: <listen_host>::</listen_host>
2022.03.15 14:32:13.951219 [ 1 ] {} <Error> Application: Listen [::]:9004 failed: Poco::Exception. Code: 1000, e.code() = 0, e.displayText() = DNS error: EAI: -9 (version 20.3.9.70 (official build)). If it is an IPv6 or IPv4 address and your host has disabled IPv6 or IPv4, then consider to specify not disabled IPv4 or IPv6 address to listen in <listen_host> element of configuration file. Example for disabled IPv6: <listen_host>0.0.0.0</listen_host> . Example for disabled IPv4: <listen_host>::</listen_host>
2022.03.15 14:32:13.951330 [ 1 ] {} <Information> Application: Listening for http://0.0.0.0:8123
2022.03.15 14:32:13.951442 [ 1 ] {} <Information> Application: Listening for connections with native protocol (tcp): 0.0.0.0:9000
2022.03.15 14:32:13.951534 [ 1 ] {} <Information> Application: Listening for replica communication (interserver): http://0.0.0.0:9009
2022.03.15 14:32:14.126773 [ 1 ] {} <Information> Application: Listening for MySQL compatibility protocol: 0.0.0.0:9004
2022.03.15 14:32:14.127392 [ 1 ] {} <Information> Application: Available RAM: 3.80 GiB; physical cores: 2; logical cores: 2.
2022.03.15 14:32:14.127416 [ 1 ] {} <Information> Application: Ready for connections.
Include not found: clickhouse_remote_servers
Include not found: clickhouse_compression
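(The log above shows ClickHouse eventually listening on 0.0.0.0:8123 despite the IPv6 listen errors, so the server itself looks reachable. ClickHouse's HTTP interface exposes a /ping endpoint that answers "Ok." when the server is up; probing it from inside the container is a sketch of checking whether a healthcheck against that port could ever pass. The container ID is the one from the log above:)

```shell
# Probe ClickHouse's HTTP ping endpoint from inside the container.
# A healthy server responds with "Ok."
docker exec 2047e3808568 wget -qO- http://localhost:8123/ping
```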

jthomaschewski commented on Mar 15, 2022

I recently upgraded from 20.12.1 -> 21.6.3 -> 22.2.0 and running docker-compose down -v --remove-orphans did not help

I think this needs to be run before checking out the new release branch/tag.
The issue is that --remove-orphans only removes orphan containers of the same docker-compose project. But as the project name changed, it won't discover and clean up running containers of the old project.

So running docker-compose down while still having the old version checked out should stop and remove all containers. Then checking out the new version and running install.sh should work.
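(A sketch of that ordering; the tag name is an example, and the point is only that the down happens under the old checkout, where the old project name still resolves:)

```shell
# 1. With the OLD version still checked out, stop and remove its containers.
#    The project name still matches, so docker-compose finds all of them.
docker-compose down

# 2. Only then check out the new release and run the installer.
git checkout 22.2.0
./install.sh
```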

chadwhitacre (Member) commented on Mar 15, 2022

In that case would something like #1384 address this?

chadwhitacre (Member) commented on Mar 15, 2022

Here goes nothin'. ¯\_(ツ)_/¯

jthomaschewski commented on Mar 15, 2022

In that case would something like #1384 address this?

lgtm, I believe this should fix the issue, thanks!
I haven't tested it though, as I upgraded my instances a while ago.

locked and limited conversation to collaborators on Mar 31, 2022