-
Notifications
You must be signed in to change notification settings - Fork 3.6k
[Issue 8138][pulsar-client] Fix ConsumerImpl memory leaks #8160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Issue 8138][pulsar-client] Fix ConsumerImpl memory leaks #8160
Conversation
- call closeConsumerTasks whenever consumer is discarded
/pulsarbot run-failure-checks |
9532630
to
175f03e
Compare
- when the connection gets switched, the ConsumerImpl must be de-registered from the previous ClientCnx where it got registered previously - the connection gets switched with the current seek implementation that disconnects the consumer after a seek. - the new connection is picked randomly of the existing connections in the connection pool. Therefore reproducing the leak required setting connectionsPerBroker > 1. - attempt to cover also other cases where the ConsumerImpl instance wasn't de-registered properly. - whenever client.cleanupConsumer(this) is called, the de-registration should be made - fix a bug in closeAsync where cleanup was skipped when the channel was closed or an exception occurred in sending the close consumer request
175f03e
to
0a706fb
Compare
I have revisited the original PR I made earlier today. The changes are now ready for review. It seems that the current model used to handle state and state transitions is error prone. The code gets duplicated in several locations and it's hard to be sure if it's fine to unify the behavior in all cases. I don't have test coverage for the changes I made. It seems that it would be possible to model consumer's state machine and test that it behaves in the correct way. However the barrier for such improvements is fairly large. :) |
/pulsarbot run-failure-checks |
The CI checks have passed now. @jiazhai Would you be able to review and provide feedback? Thank you |
Fixes apache#8138 ### Motivation The issue apache#8138 is about multiple memory leaks happening when using the Pulsar Java Client Reader API. One of the leaks was previously fixed by apache#8149 . This PR fixes two remaining ConsumerImpl memory leak issues: - leaks of ConsumerImpl instances via active timers (unAckedMessageTracker, acknowledgmentsGroupingTracker, stats) - leaks of ConsumerImpl instances via ClientCnx when the consumer gets closed by the Broker and the consumer picks a new connection to use ### Modifications - modifications to fix the leaks of ConsumerImpl via active timers: - calls to "closeConsumerTasks" were added in the locations where they were missing - modifications to fix the leaks of ConsumerImpl via ClientCnx - a solution was added to track to which ClientCnx the ConsumerImpl has been registered. When the ConsumerImpl gets registered to a new ClientCnx, the same instance is unregistered from the previous ClientCnx it was previously registered to. - when the connection gets switched, the ConsumerImpl must be de-registered from the previous ClientCnx where it got registered previously - the connection gets switched with the current seek implementation that disconnects the consumer after a seek. - the new connection is picked randomly of the existing connections in the connection pool. Therefore reproducing the leak required setting connectionsPerBroker > 1. - attempt to cover also other cases where the ConsumerImpl instance wasn't de-registered properly. - whenever client.cleanupConsumer(this) is called, the de-registration from ClientCnx (where the ConsumerImpl was previously registered) should be made - fix a bug in closeAsync where cleanup was skipped when the channel was closed or an exception occurred in sending the close consumer request ### Verifying this change - this PR doesn't include new tests. There is a manual test in lhotari@aff700c that has been used to reproduce the memory leak and to verify that no ConsumerImpl references are leaked after the fix has been applied.
Fixes apache#8138 ### Motivation The issue apache#8138 is about multiple memory leaks happening when using the Pulsar Java Client Reader API. One of the leaks was previously fixed by apache#8149 . This PR fixes two remaining ConsumerImpl memory leak issues: - leaks of ConsumerImpl instances via active timers (unAckedMessageTracker, acknowledgmentsGroupingTracker, stats) - leaks of ConsumerImpl instances via ClientCnx when the consumer gets closed by the Broker and the consumer picks a new connection to use ### Modifications - modifications to fix the leaks of ConsumerImpl via active timers: - calls to "closeConsumerTasks" were added in the locations where they were missing - modifications to fix the leaks of ConsumerImpl via ClientCnx - a solution was added to track to which ClientCnx the ConsumerImpl has been registered. When the ConsumerImpl gets registered to a new ClientCnx, the same instance is unregistered from the previous ClientCnx it was previously registered to. - when the connection gets switched, the ConsumerImpl must be de-registered from the previous ClientCnx where it got registered previously - the connection gets switched with the current seek implementation that disconnects the consumer after a seek. - the new connection is picked randomly of the existing connections in the connection pool. Therefore reproducing the leak required setting connectionsPerBroker > 1. - attempt to cover also other cases where the ConsumerImpl instance wasn't de-registered properly. - whenever client.cleanupConsumer(this) is called, the de-registration from ClientCnx (where the ConsumerImpl was previously registered) should be made - fix a bug in closeAsync where cleanup was skipped when the channel was closed or an exception occurred in sending the close consumer request ### Verifying this change - this PR doesn't include new tests. There is a manual test in lhotari@aff700c that has been used to reproduce the memory leak and to verify that no ConsumerImpl references are leaked after the fix has been applied.
Fixes #8138
Motivation
The issue #8138 is about multiple memory leaks happening when using the Pulsar Java Client Reader API.
One of the leaks was previously fixed by #8149 .
This PR fixes two remaining ConsumerImpl memory leak issues:
Modifications
modifications to fix the leaks of ConsumerImpl via active timers:
modifications to fix the leaks of ConsumerImpl via ClientCnx
a solution was added to track to which ClientCnx the ConsumerImpl has been registered. When the ConsumerImpl gets registered to a new ClientCnx, the same instance is unregistered from the previous ClientCnx it was previously registered to.
when the connection gets switched, the ConsumerImpl must be
de-registered from the previous ClientCnx where it got registered
previously
that disconnects the consumer after a seek.
in the connection pool. Therefore reproducing the leak required
setting connectionsPerBroker > 1.
attempt to cover also other cases where the ConsumerImpl instance
wasn't de-registered properly.
de-registration from ClientCnx (where the ConsumerImpl was previously registered) should be made
was closed or an exception occurred in sending the close consumer
request
Verifying this change