Skip to content

Zuul post retry on 503 #892

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
yukatan opened this issue Mar 9, 2016 · 15 comments · Fixed by #2209
Closed

Zuul post retry on 503 #892

yukatan opened this issue Mar 9, 2016 · 15 comments · Fixed by #2209

Comments

@yukatan
Copy link

yukatan commented Mar 9, 2016

We have a service that returns 503 responses when some dependencies are not available. We are observing that in this situation zuul retry one time.

The problem is that the request is a POST with json body and the first time the body is correctly sent, but when zuul retries, the body is sent empty and it results in a bad request to the client. We are in Angel.SR3 but the response is the same with SR6

{
  "result": {
    "code": 400,
    "info": "Bad Request"
  },
  "errors": [
    {
      "code": "requestBodyInvalid",
      "description": "request body is invalid"
    }
  ]
}

Any idea of how to retry with body?? it seems a serius bug to me

Thanks in advance!!

@spencergibb
Copy link
Member

Can you show more of your configuration please? Retries are the responsibility of ribbon in zuul. By default it only retries GET.

@yukatan
Copy link
Author

yukatan commented Mar 9, 2016

Hi,

After some test I push a demo with the bug running here
Steps to reproduce

  • Run Eureka
  • Run Zuul
  • Launch Service503Test testRepeatUntilBadRequest() Class
  • Wait until service-fail is available in Eureka and Zuul.
  • Look at the test log until you get something similar to this:
2016-03-09 22:31:13.755  INFO 14706 --- [nio-8080-exec-1] org.yukatan.service.Application503       : Body arrived to the service: {"content":"hello"}

2016-03-09 22:31:13.817  INFO 14706 --- [           main] org.yukatan.service.Service503Test       : Response:<400 Bad Request,{"timestamp":1457559073768,"status":400,"error":"Bad Request","exception":"org.springframework.http.converter.HttpMessageNotReadableException","message":"Required request body content is missing: org.springframework.web.method.HandlerMethod$HandlerMethodParameter@113443","path":"/test"},{Server=[Apache-Coyote/1.1], X-Application-Context=[zuul:8000], Date=[Wed, 09 Mar 2016 21:31:13 GMT], Content-Type=[application/json;charset=UTF-8], Transfer-Encoding=[chunked], Connection=[close]}

2016-03-09 22:31:18.825  INFO 14706 --- [nio-8080-exec-3] org.yukatan.service.Application503       : Body arrived to the service: {"content":"hello"}

2016-03-09 22:31:18.834  INFO 14706 --- [           main] org.yukatan.service.Service503Test       : Response:<400 Bad Request,{"timestamp":1457559078832,"status":400,"error":"Bad Request","exception":"org.springframework.http.converter.HttpMessageNotReadableException","message":"Required request body content is missing: org.springframework.web.method.HandlerMethod$HandlerMethodParameter@113443","path":"/test"},{Server=[Apache-Coyote/1.1], X-Application-Context=[zuul:8000], Date=[Wed, 09 Mar 2016 21:31:18 GMT], Content-Type=[application/json;charset=UTF-8], Transfer-Encoding=[chunked], Connection=[close]}>

2016-03-09 22:31:23.842  INFO 14706 --- [nio-8080-exec-5] org.yukatan.service.Application503       : Body arrived to the service: {"content":"hello"}

2016-03-09 22:31:23.850  INFO 14706 --- [           main] org.yukatan.service.Service503Test       : Response:<400 Bad Request,{"timestamp":1457559083848,"status":400,"error":"Bad Request","exception":"org.springframework.http.converter.HttpMessageNotReadableException","message":"Required request body content is missing: org.springframework.web.method.HandlerMethod$HandlerMethodParameter@113443","path":"/test"},{Server=[Apache-Coyote/1.1], X-Application-Context=[zuul:8000], Date=[Wed, 09 Mar 2016 21:31:23 GMT], Content-Type=[application/json;charset=UTF-8], Transfer-Encoding=[chunked], Connection=[close]}>

2016-03-09 22:31:28.858  INFO 14706 --- [nio-8080-exec-7] org.yukatan.service.Application503       : Body arrived to the service: {"content":"hello"}

2016-03-09 22:31:28.867  INFO 14706 --- [           main] org.yukatan.service.Service503Test       : Response:<400 Bad Request,{"timestamp":1457559088864,"status":400,"error":"Bad Request","exception":"org.springframework.http.converter.HttpMessageNotReadableException","message":"Required request body content is missing: org.springframework.web.method.HandlerMethod$HandlerMethodParameter@113443","path":"/test"},{Server=[Apache-Coyote/1.1], X-Application-Context=[zuul:8000], Date=[Wed, 09 Mar 2016 21:31:28 GMT], Content-Type=[application/json;charset=UTF-8], Transfer-Encoding=[chunked], Connection=[close]}>

As you can see, post request is retried in the same second. The first one with body the second one empty

I hope this helps.

@yukatan
Copy link
Author

yukatan commented Mar 23, 2016

Hi,
Any advance on this?

@spencergibb
Copy link
Member

@yukatan no update.

@dorkolog
Copy link

dorkolog commented Apr 4, 2016

@yukatan - feels like a dup of Netflix/zuul#193

Please take a look for the suggested workaround

@spencergibb
Copy link
Member

@yukatan did you follow @dorkolog's suggestion?

@pradeepkusingh
Copy link

Hi,
I have implemented the suggestion form another thread. is ther any plan to fix it in Sping cloud zuul/ribbon itself ?

@ryanjbaxter
Copy link
Contributor

@pradeepkusingh we will no longer use the Ribbon RestClient in Camden so this issue is not particularly valid anymore for the latest releases. Please keep all retry questions in #1290.

@andrehueppe
Copy link

andrehueppe commented Aug 3, 2017

Hi @ryanjbaxter , @spencergibb

are there any plans to fix this bug in the near future?
We ran into this problem as well and fixed it with help of the workaround mentioned in netflix/zuul#193

Of course, the workaround is outdated but we were able to adopt it into the current code (Dalson.RELEASE with Spring-Retry and spring-cloud-commons >= 1.2.3.RELEASE / spring-cloud-netflix-core >= 1.3.2.RELEASE for the 'RetryOnStatusCode feature')

This means that we have a filter which replaces the request entity on the zuul request context and a modified version of the RetryableRibbonLoadBalancingHttpClient which can reset the request's input stream which was set through the ribbon request context.

I was wondering why it is not adopted in spring-cloud already. Maybe the FormBodyWrapperFilter can be modified to run on any request with a payload in general (POST/PUT/PATCH, I think?) and the FormBodyRequestWrapper should be used to replace the requests's input stream with a resettable one.

What do you think?
Best regards,
Andre

@ryanjbaxter
Copy link
Contributor

@doernbrackandre its not on our radar at the moment, but if you would like to submit a PR with the necessary changes that would be great!

@germm
Copy link

germm commented Aug 9, 2017

I experienced the same issue with Dalston.SR2. The OkToRetryOnAllOperations option does not work for Requests with a body. The body is not submitted to the target for the retry.

Applying the workaround suggested in (Netflix/zuul#193) seams to be dangerous for large requests because all request bodies are buffered in memory.

@spencergibb
Copy link
Member

There's no other way, but to buffer the body. How else would it retry with a body?

@germm
Copy link

germm commented Aug 9, 2017

It could be implemented in a way that the requests are only buffer up to a certain body size. If the limit is exceeded, a retry would not be possible - which is IMHO better as going out of memory even if retry is not necessary if a large request is passed.

ryanjbaxter pushed a commit that referenced this issue Sep 25, 2017
Modfied RibbonCommandContext to make request entity resettable. fixes gh-892
@ezraroi
Copy link

ezraroi commented Oct 20, 2017

Any recommendation for fixing this i Dalston Release train? Or should we not retry post requests

@ryanjbaxter
Copy link
Contributor

I dont see why not. @spencergibb do you see any reason why we can't backport this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants