Consistency of kafka-indexing-services #5915

Closed
zhangxinyu1 opened this issue Jun 28, 2018 · 10 comments

Comments

@zhangxinyu1
Contributor

Recently we met a problem: if we set replicas = 2, there can be a big difference between the results of the same query returned from these two tasks when one task is lagging far behind the other. Our users were very confused when they got the results, because they found they had earned less than before.
Should we solve this problem?

@jihoonson
Contributor

Hi @zhangxinyu1. Druid randomly chooses one of the replicas by default, and there is currently no way to handle this issue. The assumption behind the current implementation is that a slight lack of synchronization is acceptable. However, it seems that this is not the case for your users. Could you elaborate on the details of your application? We can add something for such applications if it's reasonable.

@gianm
Contributor

gianm commented Jun 28, 2018

One idea might be to have a concept of a "primary" replica that has a higher priority than the non-primary one, so brokers could prefer it. I think you wouldn't want this by default (it would hurt the ability to load balance between replicas), but it could be a nice option for people who care more about consistency than load balancing.
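A minimal sketch of how a broker could prefer such a primary replica, assuming a hypothetical `ReplicaInfo` type with a priority field (illustrative only, not the actual Druid broker code):

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Illustrative replica descriptor; the priority would be assigned when the task announces itself.
class ReplicaInfo {
    final String taskId;
    final int priority; // higher priority = preferred "primary" replica

    ReplicaInfo(String taskId, int priority) {
        this.taskId = taskId;
        this.priority = priority;
    }
}

class ReplicaSelector {
    // Prefer the replica with the highest priority; a broker that wants to keep
    // load balancing could still pick randomly among replicas of equal priority.
    Optional<ReplicaInfo> selectPrimary(List<ReplicaInfo> replicas) {
        return replicas.stream().max(Comparator.comparingInt((ReplicaInfo r) -> r.priority));
    }
}
```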

@zhangxinyu1
Contributor Author

zhangxinyu1 commented Jul 3, 2018

@jihoonson For example, some users want to get ad clicks, and the click count is expected to keep increasing. However, they may get a smaller number than before because the query is routed to another peon. They will then suspect there is a problem with our statistics.

@zhangxinyu1
Contributor Author

@gianm That's a good idea! We should also record the replicas' offsets. Once the primary is shut down, we should choose another primary replica, and we should ensure that the new primary replica's offset is the largest among the remaining replicas.
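As a rough sketch of that re-election step, assuming the supervisor can see each replica's current Kafka offsets (the `Replica` type and its offsets map are hypothetical):

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical view of a replica task and the offsets it has consumed so far.
class Replica {
    final String taskId;
    final Map<Integer, Long> offsets; // partition -> current offset, as reported by the task

    Replica(String taskId, Map<Integer, Long> offsets) {
        this.taskId = taskId;
        this.offsets = offsets;
    }

    long totalOffset() {
        return offsets.values().stream().mapToLong(Long::longValue).sum();
    }
}

class PrimaryElection {
    // When the primary shuts down, promote the surviving replica that has consumed
    // the most data, so query results regress as little as possible.
    Optional<Replica> electNewPrimary(List<Replica> survivors) {
        return survivors.stream().max(Comparator.comparingLong(Replica::totalOffset));
    }
}
```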

@peferron
Contributor

peferron commented Jul 3, 2018

@jihoonson The assumption that a slight out of sync is acceptable does not cover task failures, right? For example, if a host experiences a hardware failure, tasks on this host will fail, and new tasks will start on another host from the last committed offsets. Until they catch up, these new tasks may serve data that is out of sync by dozens of minutes or even hours, depending on how old the last committed offsets were (which depends on taskDuration and intermediateHandoffPeriod, but setting these too low will create small segments during normal operation).

Ideally there would be a way to define some threshold over which replicas are considered out-of-sync, and have brokers route queries to in-sync replicas only. This would make replicas help with resiliency and data availability, in addition to helping with performance (which seems to be their primary use right now).
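As an illustrative sketch of that broker-side filtering (the `ReplicaLag` type and the lag threshold are hypothetical, not existing Druid code):

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical per-replica lag measurement relative to the most advanced replica.
class ReplicaLag {
    final String taskId;
    final long lagMillis;

    ReplicaLag(String taskId, long lagMillis) {
        this.taskId = taskId;
        this.lagMillis = lagMillis;
    }
}

class InSyncFilter {
    private final long maxLagMillis;

    InSyncFilter(long maxLagMillis) {
        this.maxLagMillis = maxLagMillis;
    }

    // Keep only replicas within the configured lag threshold; if none qualify,
    // fall back to all replicas so queries can still be answered.
    List<ReplicaLag> inSyncReplicas(List<ReplicaLag> replicas) {
        List<ReplicaLag> inSync = replicas.stream()
            .filter(r -> r.lagMillis <= maxLagMillis)
            .collect(Collectors.toList());
        return inSync.isEmpty() ? replicas : inSync;
    }
}
```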

@gianm
Contributor

gianm commented Jul 5, 2018

@peferron That's true, if you have two replicas and one fails, then the replacement for the failed replica will take some time to catch up. During that time it'll be more out of sync than normal.

I think you could get most of the way there, consistency wise, with the "primary" replica concept. The idea would be that whenever you create a new replica, it should announce itself at a lower server priority than any existing replica (maybe just subtract 1 every time you make a replica). The assumption is that older replicas will generally have more data, and so you would prefer the primary to be the oldest one at any given point. See io.druid.client.DruidServer for what I'm referring to by "priority".
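A small sketch of that announcement rule, assuming a hypothetical per-task-group counter (the real priority field lives on io.druid.client.DruidServer, as mentioned above):

```java
import java.util.concurrent.atomic.AtomicInteger;

class ReplicaPriorityAssigner {
    private static final int BASE_PRIORITY = 0; // stand-in for the default server priority
    private final AtomicInteger replicasCreated = new AtomicInteger(0);

    // Each newly created replica announces itself one priority lower than the previous
    // one, so older (and usually more caught-up) replicas remain preferred by brokers.
    int nextPriority() {
        return BASE_PRIORITY - replicasCreated.getAndIncrement();
    }
}
```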

Your idea of an "in-sync" set would be fancier, presumably taking into account the actual offsets that the tasks are at, and doing something that directs queries to the entire in-sync set rather than to a single primary replica. I can't think of a way that it could piggyback on something we already have, so it may end up being more of a new system. It should probably be mediated by the supervisor somehow, since that is the component in charge of keeping track of task offsets. I'm not sure what the best way would be to get that info to the broker.

Anyone interested in working on one of those ideas? :)

@stale

stale bot commented Jun 21, 2019

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

@stale stale bot added the stale label Jun 21, 2019
@gianm gianm reopened this Aug 11, 2019
@stale

stale bot commented Aug 11, 2019

This issue is no longer marked as stale.

@stale stale bot removed the stale label Aug 11, 2019
@stale

stale bot commented May 17, 2020

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

@stale stale bot added the stale label May 17, 2020
@stale

stale bot commented Jun 14, 2020

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.

@stale stale bot closed this as completed Jun 14, 2020