Skip to content

spring lettuce startup get blocks when including spring-cloud-starter-zipkin #1557

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
WEIZIBIN opened this issue Feb 12, 2020 · 19 comments
Closed
Milestone

Comments

@WEIZIBIN
Copy link

Describe the bug
Spring redis get connection block when include spring-cloud-starter-zipkin.

2020-02-12 11:31:27.025 INFO [,,,] 25158 --- [ main] trationDelegate$BeanPostProcessorChecker : Bean 'org.springframework.cloud.sleuth.zipkin2.sender.ZipkinRestTemplateSenderConfiguration' of type [org.springframework.cloud.sleuth.zipkin2.sender.ZipkinRestTemplateSenderConfiguration] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2020-02-12 11:31:27.027 INFO [,,,] 25158 --- [ main] trationDelegate$BeanPostProcessorChecker : Bean 'zipkinRestTemplateCustomizer' of type [org.springframework.cloud.sleuth.zipkin2.DefaultZipkinRestTemplateCustomizer] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2020-02-12 11:31:27.038 INFO [,,,] 25158 --- [ main] trationDelegate$BeanPostProcessorChecker : Bean 'zipkinSender' of type [org.springframework.cloud.sleuth.zipkin2.sender.RestTemplateSender] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2020-02-12 11:31:27.075 WARN [,,,] 25158 --- [ main] o.s.c.s.zipkin2.ZipkinAutoConfiguration : Check result of the [AsyncReporter{org.springframework.cloud.sleuth.zipkin2.sender.RestTemplateSender@1dba4e06}] contains an error [CheckResult{ok=false, error=org.springframework.web.client.ResourceAccessException: I/O error on POST request for "http://localhost:9411/api/v2/spans": Connection refused (Connection refused); nested exception is java.net.ConnectException: Connection refused (Connection refused)}]
2020-02-12 11:31:27.076 INFO [,,,] 25158 --- [ main] trationDelegate$BeanPostProcessorChecker : Bean 'zipkinReporter' of type [zipkin2.reporter.AsyncReporter$BoundedAsyncReporter] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2020-02-12 11:31:27.086 INFO [,,,] 25158 --- [ main] trationDelegate$BeanPostProcessorChecker : Bean 'tracing' of type [brave.Tracing$Default] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2020-02-12 11:31:27.089 INFO [,,,] 25158 --- [ main] trationDelegate$BeanPostProcessorChecker : Bean 'spring.sleuth.redis-org.springframework.cloud.sleuth.instrument.redis.TraceRedisProperties' of type [org.springframework.cloud.sleuth.instrument.redis.TraceRedisProperties] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2020-02-12 11:31:27.454 INFO [,,,] 25158 --- [ main] io.lettuce.core.EpollProvider : Starting without optional epoll library
2020-02-12 11:31:27.455 INFO [,,,] 25158 --- [ main] io.lettuce.core.KqueueProvider : Starting without optional kqueue library

If pom.xml not include spring-cloud-starter-zipkin, then redis success get connection.

Sample
Sample Project
block in io.lettuce.core.AbstractRedisClient#getConnection()

@marcingrzejszczak
Copy link
Contributor

@mp911de have you seen such a problem before?

@mp911de
Copy link
Member

mp911de commented Feb 12, 2020

Do you mean Connection refused (Connection refused); nested exception is java.net.ConnectException: Connection refused (Connection refused)}] or the bean-postprocesors?

@marcingrzejszczak
Copy link
Contributor

From what I understand this the problem block in io.lettuce.core.AbstractRedisClient#getConnection() . All the entries in the logs are not related to the issue as far as I understand.

BTW @WEIZIBIN you know that we have Hoxton.SR1 ?

@mp911de
Copy link
Member

mp911de commented Feb 12, 2020

The application startup gets stuck (blocked). There are two things that come together:

  1. Calling Redis in @PostConstruct which awaits command completion since it's a blocking call.
  2. Background initialization of Brave in combination with AOP. The I/O thread gets blocked by Brave/AOP by the Spring Framework initialization lock.

Since the context is not fully started, the startup lock prevents progress each time a thread different than main tries to obtain a Spring bean. Here's the stack trace of the blocked thread:

"lettuce-nioEventLoop-4-1@6556" daemon prio=5 tid=0x26 nid=NA waiting for monitor entry
  java.lang.Thread.State: BLOCKED
	 waiting for main@1 to release lock on <0x1c0c> (a java.util.concurrent.ConcurrentHashMap)
	  at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:205)
	  at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:321)
	  at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:202)
	  at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:409)
	  at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(AbstractAutowireCapableBeanFactory.java:1338)
	  at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1177)
	  at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:557)
	  at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:517)
	  at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$1(AbstractBeanFactory.java:359)
	  at org.springframework.beans.factory.support.AbstractBeanFactory$$Lambda$523.791402620.getObject(Unknown Source:-1)
	  at org.springframework.cloud.context.scope.GenericScope$BeanLifecycleWrapper.getBean(GenericScope.java:389)
	  - locked <0x1c45> (a java.lang.String)
	  at org.springframework.cloud.context.scope.GenericScope.get(GenericScope.java:186)
	  at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:356)
	  at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:202)
	  at org.springframework.aop.target.SimpleBeanTargetSource.getTarget(SimpleBeanTargetSource.java:35)
	  at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:673)
	  at brave.sampler.Sampler$$EnhancerBySpringCGLIB$$79d02789.isSampled(<generated>:-1)
	  at brave.Tracer.decorateContext(Tracer.java:268)
	  at brave.Tracer.newRootContext(Tracer.java:204)
	  at brave.Tracer.newTrace(Tracer.java:151)
	  at brave.Tracer.nextSpan(Tracer.java:470)
	  at io.lettuce.core.tracing.BraveTracing$BraveTracer.nextSpan(BraveTracing.java:259)
	  at io.lettuce.core.tracing.BraveTracing$BraveTracer.nextSpan(BraveTracing.java:266)
	  at io.lettuce.core.protocol.CommandHandler.writeSingleCommand(CommandHandler.java:392)
	  at io.lettuce.core.protocol.CommandHandler.write(CommandHandler.java:353)
	  at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:716)
	  at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:708)
	  at io.netty.channel.AbstractChannelHandlerContext.access$1700(AbstractChannelHandlerContext.java:56)
	  at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:1102)
	  at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:1149)
	  at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1073)
	  at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:163)
	  at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:-1)
	  at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:510)
	  at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:518)
	  at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050)
	  at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	  at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	  at java.lang.Thread.run(Thread.java:830)

@WEIZIBIN
Copy link
Author

Thanks, I have reached the same conclusion after using the jstack command. Has this issue been resolved now? @mp911de

@WEIZIBIN
Copy link
Author

I upgraded to Hoxton.SR1 and still have the same problem. @marcingrzejszczak

@Comven
Copy link

Comven commented Feb 12, 2020

I upgraded to Hoxton.SR1 and still have the same problem. @marcingrzejszczak

spring-boot-2.2.4+spring-cloud Hoxton.SR1 have the same problem

@WEIZIBIN
Copy link
Author

My current solution is to execute it before redis executes the command.
tracing.tracer().nextSpan ();
Then [Main] thread initializes Tracer first, so when redis executes the command, the IO thread will not block at getSingleton().

Is there a better solution?

@WEIZIBIN WEIZIBIN changed the title spring redis get connection block when include spring-cloud-starter-zipkin spring lettuce startup get blocks when including spring-cloud-starter-zipkin Feb 12, 2020
@marcingrzejszczak
Copy link
Contributor

Hey @WEIZIBIN , this is an extremely interesting case. I had a chat with @mp911de and we have the following conclusions:

The problem is here in SamplerAutoConfiguration in usage of RefreshScope.

@Configuration(proxyBeanMethods = false)
	@ConditionalOnBean(
			type = "org.springframework.cloud.context.scope.refresh.RefreshScope")
	protected static class RefreshScopedSamplerConfiguration {

		@Bean
		@RefreshScope
		@ConditionalOnMissingBean
		public Sampler defaultTraceSampler(SamplerProperties config) {
			return samplerFromProps(config);
		}

	}

Since we're using RefreshScope around the sampler bean and according to @mp911de the proxy is not lazy, we have an issue that there's a lock on the bean.

In order for you to have a workaround, you can use what you mentioned or create your own sampler bean e.g. using the always sampler if that's ok for you.

@Bean
	public Sampler mySampler() {
		return Sampler.ALWAYS_SAMPLE;
	}

Now I guess we'd need to chat with @dsyer and @spencergibb (cc @ryanjbaxter, @OlgaMaciaszek ) about those refresh scope proxies and their usage in @PostConstruct annotated methods.

@Comven
Copy link

Comven commented Apr 24, 2020

Is the bug fixed?

@marcingrzejszczak
Copy link
Contributor

Pinging @dsyer and @spencergibb (cc @ryanjbaxter, @OlgaMaciaszek ) about those refresh scope proxies and their usage in @PostConstruct annotated methods. Any ideas how to fix this?

@marcingrzejszczak
Copy link
Contributor

The suggested solution to this problem is

By default samplers will work with the refresh scope mechanism. That means that you can change the sampling properties at runtime, refresh the application and the changes will be reflected. However, sometimes the fact of creating a proxy around samplers and calling it from too early (from @PostConstruct annotated method) may lead to dead locks. In such a case either create a sampler bean explicitly, or set the property spring.sleuth.sampler.refresh.enabled to false to disable the refresh scope support.

Best solution would be not to use the redis stuff from @PostConstruct. If you have to do it, just set the sampler bean explicitly or wait for 3.0.0 release and set the spring.sleuth.sampler.refresh.enabled to false

@elvizlai
Copy link

+1

@sharath29
Copy link

I am getting similar error when trying to connect to redis from spring boot application, from the private static final StatefulRedisConnection<String, String> connection = redisClientNew.connect(); line.

@Component
@RequiredArgsConstructor
public class RedisCacheClient {

    private static final SpectraLogger logger = OlympusSpectra.getLogger(RedisCacheClient.class);
    private static final RedisURI redisURI = RedisURI.builder().withSentinel("localhost", 26379).withSentinelMasterId("mymaster").build();
    private static final RedisClient redisClientNew = RedisClient.create(redisURI);
    private static final StatefulRedisConnection<String, String> connection = redisClientNew.connect();
    private static final RedisAsyncCommands<String, String> async = connection.async();
    private static final Gson gson = new Gson();
    private static final Long KEY_EXIST = 1L;

Spring boot console log stuck after these two messages:

2021-06-19 16:02:44,704 8253 [main] INFO  [i.l.c.EpollProvider] {{clusterName,default}{appResourceId,ss@dev}} - Starting without optional epoll library 
2021-06-19 16:02:44,707 8256 [main] INFO  [i.l.c.KqueueProvider] {{clusterName,default}{appResourceId,ss@dev}} - Starting without optional kqueue library 

The error is coming from io.lettuce.core.AbstractRedisClient return connectionFuture.get(); inside the try block.

    protected <T> T getConnection(ConnectionFuture<T> connectionFuture) {
        try {
            return connectionFuture.get();
        } catch (InterruptedException var3) {
            Thread.currentThread().interrupt();
            throw RedisConnectionException.create(connectionFuture.getRemoteAddress(), var3);
        } catch (Exception var4) {
            if (var4 instanceof ExecutionException) {
                throw RedisConnectionException.create(connectionFuture.getRemoteAddress(), var4.getCause());
            } else {
                throw RedisConnectionException.create(connectionFuture.getRemoteAddress(), var4);
            }
        }
    }

Please let me know if this is the same issue and how can i proceed further.

pom.xml

    <dependency>
      <groupId>io.lettuce</groupId>
      <artifactId>lettuce-core</artifactId>
    </dependency>
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-data-redis</artifactId>
    </dependency>

@marcingrzejszczak
Copy link
Contributor

You have not provided which version of cloud, boot and sleuth you're using. Please update to the latest versions and try again

@sharath29
Copy link

You have not provided which version of cloud, boot and sleuth you're using. Please update to the latest versions and try again

I am not using these two in my project.

@marcingrzejszczak
Copy link
Contributor

So why are you commenting on a sleuth lettuce integration issue in a sleuth project that depends on boot is you're neither using sleuth nor boot?

@sharath29
Copy link

I have been stuck with this error and the only closest thing which i could find was this post, and the errors being faced here matches with mine since its coming from the exact same line in the lettuce module.

@spencergibb
Copy link
Member

We can only help with spring cloud sleuth integration with lettuce. You'll need to find help elsewhere.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants