Add support for max.poll.records consumer configuration #1653
Comments
There are two options:
Actually, I am using the node wrapper https://github.com/Blizzard/node-rdkafka, so I can't use these options directly. Wouldn't it be useful to support this config in librdkafka itself, so that all the wrappers get the functionality without code changes?
This would require a new API, since the current consumer_poll() API returns a single message; it wouldn't automatically trickle down into the bindings.
Correct me if I am wrong, but you don't fetch single messages from Kafka, right? Because that's not possible to control without using max.poll.records. So you are fetching multiple messages, storing them, and providing single messages through the API. In that case no API change is required: this config only controls the rate of fetching from Kafka, and you can still provide messages one at a time to the library user as before.
Not sure I follow what you are asking for, do you want:
librdkafka pre-fetches messages from the broker into an internal queue, from which the application is served when it calls consumer_poll() (et al.).
I only need to control the fetch rate from Kafka. I was saying it might not require any API changes, since the API is independent of the fetch rate.
Can you explain your use-case in more detail? https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md
Here's an example of how to implement a batch consume interface:
Can't consumer->consume(remaining_timeout) give you more messages than batch_size?

queued.min.messages, queued.max.messages.kbytes, and fetch.wait.max.ms only limit the rate to a certain extent; they are not a foolproof way to cap the fetch rate at, say, 100 messages/sec, especially when message sizes vary greatly. Instead, a single max.poll.records setting could guarantee that we never fetch more messages from Kafka than configured. I think this is a very valuable feature; the lack of throttling capability is a major hindrance to Kafka adoption in my org. The official Java client already has this feature, but we are restricted to Node.js.
Actually, there is no way to ask the broker for a maximum number of messages, only a maximum total size of messages; this means the Java consumer will also fetch more than max.poll.records messages and merely caps how many a single poll() returns.

Even with librdkafka's pre-fetching, the fetch rate of the consumer will over time correspond to the consume rate of the application: as the internal fetchq fills up with pre-fetched messages the fetcher will stop fetching until the application has consumed enough messages to make the fetchq drop below the configured thresholds (be it queued.min.messages or queued.max.messages.kbytes).
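The thresholds mentioned above are existing librdkafka properties (see CONFIGURATION.md); the values below are illustrative only, not recommendations:

```
# Illustrative values -- tune to your workload.
queued.min.messages=1000          # stop pre-fetching once ~1000 msgs are queued
queued.max.messages.kbytes=65536  # ...or once the queue holds ~64 MB
fetch.wait.max.ms=500             # broker-side wait to fill fetch.min.bytes
```

Lowering these slows pre-fetching, but as noted they bound the queue by count or bytes rather than guaranteeing an exact messages-per-second rate.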
@edenhill Is there any plan to provide support for Java's 'max.poll.records' property in C/C++?
@choudhary001 You mean for the consume_batch() API? |
@edenhill Sorry, I was reading this issue because there was a question in another issue about adding batch consume to PHP.
@nick-zh Yep, that sounds about right. |
This config allows consumers to control their rate of consumption, which would be very helpful in upstream throttling scenarios:
From the Kafka documentation:
max.poll.records: The maximum number of records returned in a single call to poll()
It has been available since Kafka 0.11.
Can this be added please?