Closed
Description
Recently, we met a problem: if we set replica = 2, there may be a big difference
between the results of the same query returned from these two tasks when one consumer lags far behind the other. Our users were very confused when they got the results, because they found they earned less than before.
Should we solve the problem?
Activity
jihoonson commented on Jun 28, 2018
Hi @zhangxinyu1. Druid randomly chooses one of the replicas by default, and there is currently no way to handle this issue. The assumption behind the current implementation is that a slight out-of-sync state is acceptable. However, it seems that's not the case for your users. Could you elaborate on the details of your application? We can add something for those applications if it's reasonable.
gianm commented on Jun 28, 2018
One idea might be to have a concept of a "primary" replica that has a higher priority than the non-primary one. So, brokers could prefer it. I think you wouldn't want this by default (it would hurt the ability to load balance between replicas) but it could be a nice option for people that care more about consistency than load balancing.
zhangxinyu1 commented on Jul 3, 2018
@jihoonson For example, some users want to get the ad click count, which is expected to keep increasing. However, they may get a smaller number than before because the query is routed to another peon. They will then suspect there is a problem with our statistics.
zhangxinyu1 commented on Jul 3, 2018
@gianm It's a good idea! We should also record the replicas' offsets. Once the primary is shut down, we should choose another primary replica, and we should ensure that the new primary replica's offset is the largest among the remaining replicas.
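(A minimal sketch of the failover rule described in the comment above, assuming we can read each replica task's consumed offsets: on primary failure, promote the surviving replica whose offsets are furthest ahead. All class and method names here are illustrative placeholders, not Druid APIs.)

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class PrimaryFailover
{
  // Hypothetical view of a replica task and the offsets it has consumed per partition.
  public record ReplicaTask(String taskId, Map<Integer, Long> partitionOffsets)
  {
    // Crude "how far ahead" measure: sum of offsets across partitions.
    long totalOffset()
    {
      return partitionOffsets.values().stream().mapToLong(Long::longValue).sum();
    }
  }

  // Promote the surviving replica with the largest total offset as the new primary.
  public static Optional<ReplicaTask> electNewPrimary(List<ReplicaTask> survivors)
  {
    return survivors.stream().max(Comparator.comparingLong(ReplicaTask::totalOffset));
  }

  public static void main(String[] args)
  {
    List<ReplicaTask> survivors = List.of(
        new ReplicaTask("task-a", Map.of(0, 1_000L, 1, 900L)),
        new ReplicaTask("task-b", Map.of(0, 1_200L, 1, 1_100L))
    );
    // task-b wins because it has consumed further into the stream.
    System.out.println("New primary: " + electNewPrimary(survivors).orElseThrow().taskId());
  }
}
```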
peferron commented on Jul 3, 2018
@jihoonson The assumption that a slight out of sync is acceptable does not cover task failures, right? For example, if a host experiences a hardware failure, tasks on this host will fail, and new tasks will start on another host from the last committed offsets. Until they catch up, these new tasks may serve data that is out of sync by dozens of minutes or even hours, depending on how old the last committed offsets were (which depends on taskDuration and intermediateHandoffPeriod, but setting these too low will create small segments during normal operation).
Ideally there would be a way to define some threshold over which replicas are considered out-of-sync, and have brokers route queries to in-sync replicas only. This would make replicas help with resiliency and data availability, in addition to helping with performance (which seems to be their primary use right now).
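(A minimal sketch of that threshold idea, assuming the broker has some view of each replica's consumer lag: queries go only to replicas within the threshold, falling back to all replicas if none qualify. The Replica type, lag field, and threshold parameter are hypothetical, not existing Druid APIs.)

```java
import java.util.List;
import java.util.stream.Collectors;

public class InSyncReplicaFilter
{
  // Hypothetical summary of a replica task and how far behind the stream it is.
  public record Replica(String taskId, long lagSeconds) {}

  // Keep replicas whose lag is within the threshold; if none qualify (e.g. right
  // after a failure), fall back to all replicas rather than failing the query.
  public static List<Replica> selectQueryTargets(List<Replica> replicas, long maxAllowedLagSeconds)
  {
    List<Replica> inSync = replicas.stream()
        .filter(r -> r.lagSeconds() <= maxAllowedLagSeconds)
        .collect(Collectors.toList());
    return inSync.isEmpty() ? replicas : inSync;
  }

  public static void main(String[] args)
  {
    List<Replica> replicas = List.of(
        new Replica("task-a", 5),      // healthy replica, a few seconds behind
        new Replica("task-b", 3_600)   // replacement task still catching up
    );
    // With a 60-second threshold, only task-a would receive queries.
    System.out.println(selectQueryTargets(replicas, 60));
  }
}
```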
gianm commented on Jul 5, 2018
@peferron That's true, if you have two replicas and one fails, then the replacement for the failed replica will take some time to catch up. During that time it'll be more out of sync than normal.
I think you could get most of the way there, consistency wise, with the "primary" replica concept. The idea would be that whenever you create a new replica, it should announce itself at a lower server priority than any existing replica (maybe just subtract 1 every time you make a replica). The assumption is that older replicas will generally have more data, and so you would prefer the primary to be the oldest one at any given point. See io.druid.client.DruidServer for what I'm referring to by "priority".
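(A minimal sketch of the "subtract 1 every time you make a replica" idea, assuming each newly created replica simply announces itself at a priority one lower than the previous one so a broker that prefers high priority keeps routing to the oldest, most caught-up replica. The class here is purely illustrative; only the mention of server priority on io.druid.client.DruidServer comes from the comment above.)

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ReplicaPriorityAssigner
{
  private final int basePriority;
  private final AtomicInteger replicasCreated = new AtomicInteger(0);

  public ReplicaPriorityAssigner(int basePriority)
  {
    this.basePriority = basePriority;
  }

  // Every newly created replica gets a priority one lower than the previous one,
  // so older (presumably more caught-up) replicas are always preferred by a
  // broker that picks the highest-priority server.
  public int nextReplicaPriority()
  {
    return basePriority - replicasCreated.getAndIncrement();
  }

  public static void main(String[] args)
  {
    ReplicaPriorityAssigner assigner = new ReplicaPriorityAssigner(0);
    System.out.println("first replica priority:  " + assigner.nextReplicaPriority()); // 0
    System.out.println("second replica priority: " + assigner.nextReplicaPriority()); // -1
    System.out.println("replacement priority:    " + assigner.nextReplicaPriority()); // -2
  }
}
```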
Your idea of an "in-sync" set would be fancier, presumably taking into account the actual offsets that the tasks are at, and doing something that directs queries to the entire in-sync set rather than to a single primary replica. I can't think of a way that it could piggyback off something we already have, so it may end up being more of a new system. It should probably be mediated by the supervisor somehow, which is the thing that is in charge of keeping track of task offsets. I'm not sure what the best way would be to get that info to the broker.
Anyone interested in working on one of those ideas? :)
stale commented on Jun 21, 2019
This issue has been marked as stale due to 280 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.
stale commented on Aug 11, 2019
This issue is no longer marked as stale.
stale commented on May 17, 2020
This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.
stale commented on Jun 14, 2020
This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.