Closed
Description
Recently, we met a problem: if we set replica = 2, there may be a big difference
between the results of the same query returned from these two tasks when one consumer lags far behind the other. Our users were very confused when they got the results, because they found they earned less than before.
Should we solve the problem?
Activity
jihoonson commented on Jun 28, 2018
Hi @zhangxinyu1. Druid randomly chooses one of the replicas by default, and there is currently no way to handle this issue. The assumption behind the current implementation is that a slight out-of-sync state is acceptable. However, it seems that's not the case for your users. Could you elaborate on the details of your application? We can add something for those applications if it's reasonable.
gianm commented on Jun 28, 2018
One idea might be to have a concept of a "primary" replica that has a higher priority than the non-primary one. So, brokers could prefer it. I think you wouldn't want this by default (it would hurt the ability to load balance between replicas) but it could be a nice option for people that care more about consistency than load balancing.
zhangxinyu1 commented on Jul 3, 2018
@jihoonson For example, some users want to get the ad click count, which is expected to keep increasing. However, they may get a smaller number than before because the query is routed to another peon. They will then suspect there is a problem with our statistics.
zhangxinyu1 commented on Jul 3, 2018
@gianm It's a good idea! We should also record the replicas' offsets. Once the primary is shut down, we should choose another primary replica, and we should ensure that the new primary replica's offset is the largest among the remaining replicas.
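(A minimal sketch of the failover rule described in the comment above, assuming we can read each replica task's consumed offsets: on primary failure, promote the surviving replica whose offsets are furthest ahead. All class and method names here are illustrative placeholders, not Druid APIs.)

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class PrimaryFailover
{
  // Hypothetical view of a replica task and the offsets it has consumed per partition.
  public record ReplicaTask(String taskId, Map<Integer, Long> partitionOffsets)
  {
    // Crude "how far ahead" measure: sum of offsets across partitions.
    long totalOffset()
    {
      return partitionOffsets.values().stream().mapToLong(Long::longValue).sum();
    }
  }

  // Promote the surviving replica with the largest total offset as the new primary.
  public static Optional<ReplicaTask> electNewPrimary(List<ReplicaTask> survivors)
  {
    return survivors.stream().max(Comparator.comparingLong(ReplicaTask::totalOffset));
  }

  public static void main(String[] args)
  {
    List<ReplicaTask> survivors = List.of(
        new ReplicaTask("task-a", Map.of(0, 1_000L, 1, 900L)),
        new ReplicaTask("task-b", Map.of(0, 1_200L, 1, 1_100L))
    );
    // task-b wins because it has consumed further into the stream.
    System.out.println("New primary: " + electNewPrimary(survivors).orElseThrow().taskId());
  }
}
```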
peferron commented on Jul 3, 2018
@jihoonson The assumption that a slight out of sync is acceptable does not cover task failures, right? For example, if a host experiences a hardware failure, tasks on this host will fail, and new tasks will start on another host from the last committed offsets. Until they catch up, these new tasks may serve data that is out of sync by dozens of minutes or even hours, depending on how old the last committed offsets were (which depends on taskDuration and intermediateHandoffPeriod, but setting these too low will create small segments during normal operation).
Ideally there would be a way to define some threshold over which replicas are considered out-of-sync, and have brokers route queries to in-sync replicas only. This would make replicas help with resiliency and data availability, in addition to helping with performance (which seems to be their primary use right now).
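(A minimal sketch of that threshold idea, assuming the broker has some view of each replica's consumer lag: queries go only to replicas within the threshold, falling back to all replicas if none qualify. The Replica type, lag field, and threshold parameter are hypothetical, not existing Druid APIs.)

```java
import java.util.List;
import java.util.stream.Collectors;

public class InSyncReplicaFilter
{
  // Hypothetical summary of a replica task and how far behind the stream it is.
  public record Replica(String taskId, long lagSeconds) {}

  // Keep replicas whose lag is within the threshold; if none qualify (e.g. right
  // after a failure), fall back to all replicas rather than failing the query.
  public static List<Replica> selectQueryTargets(List<Replica> replicas, long maxAllowedLagSeconds)
  {
    List<Replica> inSync = replicas.stream()
        .filter(r -> r.lagSeconds() <= maxAllowedLagSeconds)
        .collect(Collectors.toList());
    return inSync.isEmpty() ? replicas : inSync;
  }

  public static void main(String[] args)
  {
    List<Replica> replicas = List.of(
        new Replica("task-a", 5),      // healthy replica, a few seconds behind
        new Replica("task-b", 3_600)   // replacement task still catching up
    );
    // With a 60-second threshold, only task-a would receive queries.
    System.out.println(selectQueryTargets(replicas, 60));
  }
}
```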
gianm commented on Jul 5, 2018
@peferron That's true, if you have two replicas and one fails, then the replacement for the failed replica will take some time to catch up. During that time it'll be more out of sync than normal.
I think you could get most of the way there, consistency wise, with the "primary" replica concept. The idea would be that whenever you create a new replica, it should announce itself at a lower server priority than any existing replica (maybe just subtract 1 every time you make a replica). The assumption is that older replicas will generally have more data, and so you would prefer the primary to be the oldest one at any given point. See io.druid.client.DruidServer for what I'm referring to by "priority".
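(A minimal sketch of the "subtract 1 every time you make a replica" idea, assuming each newly created replica simply announces itself at a priority one lower than the previous one so a broker that prefers high priority keeps routing to the oldest, most caught-up replica. The class here is purely illustrative; only the mention of server priority on io.druid.client.DruidServer comes from the comment above.)

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ReplicaPriorityAssigner
{
  private final int basePriority;
  private final AtomicInteger replicasCreated = new AtomicInteger(0);

  public ReplicaPriorityAssigner(int basePriority)
  {
    this.basePriority = basePriority;
  }

  // Every newly created replica gets a priority one lower than the previous one,
  // so older (presumably more caught-up) replicas are always preferred by a
  // broker that picks the highest-priority server.
  public int nextReplicaPriority()
  {
    return basePriority - replicasCreated.getAndIncrement();
  }

  public static void main(String[] args)
  {
    ReplicaPriorityAssigner assigner = new ReplicaPriorityAssigner(0);
    System.out.println("first replica priority:  " + assigner.nextReplicaPriority()); // 0
    System.out.println("second replica priority: " + assigner.nextReplicaPriority()); // -1
    System.out.println("replacement priority:    " + assigner.nextReplicaPriority()); // -2
  }
}
```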
Your idea of an "in-sync" set would be fancier, presumably taking into account the actual offsets that the tasks are at, and doing something that directs queries to the entire in-sync set rather than to a single primary replica. I can't think of a way that it could piggyback off something we already have, so it may end up being more of a new system. It should probably be mediated by the supervisor somehow, which is the thing that is in charge of keeping track of task offsets. I'm not sure what the best way would be to get that info to the broker.
Anyone interested in working on one of those ideas? :)
stale commented on Jun 21, 2019
This issue has been marked as stale due to 280 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.
stale commented on Aug 11, 2019
This issue is no longer marked as stale.
stale commented on May 17, 2020
This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.
stale commented on Jun 14, 2020
This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.