
Consistency of kafka-indexing-services #5915

Closed

Description

@zhangxinyu1 (Contributor)

Recently, we ran into a problem: if we set replica = 2, the results of the same query can differ significantly between the two tasks when one consumer lags far behind the other. Our users were very confused when they got the results, because they found they had earned less than before.
Should we solve this problem?

Activity

@jihoonson (Contributor) commented on Jun 28, 2018

Hi @zhangxinyu1. Druid randomly chooses one of the replicas by default, and there is currently no way to handle this issue. The assumption behind the current implementation is that a slight amount of out-of-sync data is acceptable. However, it seems this is not the case for your users. Would you elaborate on the details of your application? We can add something for such applications if it's reasonable.

@gianm (Contributor) commented on Jun 28, 2018

One idea might be to have a concept of a "primary" replica that has a higher priority than the non-primary one. So, brokers could prefer it. I think you wouldn't want this by default (it would hurt the ability to load balance between replicas) but it could be a nice option for people that care more about consistency than load balancing.
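
A minimal sketch of what broker-side selection could look like under this scheme; the `Replica` type and `select` method are hypothetical, not Druid APIs (the real server abstraction is `io.druid.client.DruidServer`):

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class PrimaryReplicaSelector
{
  // Hypothetical stand-in for a replica as the broker sees it.
  record Replica(String name, int priority) {}

  // Prefer the highest-priority ("primary") replica instead of picking
  // randomly. This trades load balancing for read consistency.
  static Optional<Replica> select(List<Replica> candidates)
  {
    return candidates.stream().max(Comparator.comparingInt(Replica::priority));
  }

  public static void main(String[] args)
  {
    List<Replica> replicas = List.of(
        new Replica("replica-a", 0),   // designated primary
        new Replica("replica-b", -1)); // non-primary
    System.out.println(select(replicas)); // Optional[Replica[name=replica-a, priority=0]]
  }
}
```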

@zhangxinyu1 (Contributor, Author) commented on Jul 3, 2018

@jihoonson For example, some users query the number of ad clicks, which is expected to only increase over time. However, they may get a smaller number than before because the query is routed to another peon. They then suspect there is a problem with our statistics.

@zhangxinyu1 (Contributor, Author) commented on Jul 3, 2018

@gianm It's a good idea! We should also record the replicas' offsets. Once the primary is shut down, we should choose another replica as the primary, and we should ensure the new primary's offset is the largest among the survivors.
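
A minimal sketch of that failover rule, assuming the supervisor tracked each replica's consumed Kafka offset (all names here are hypothetical, not Druid APIs):

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class PrimaryFailover
{
  // Hypothetical view of a replica task and the last Kafka offset it consumed.
  record ReplicaTask(String taskId, long consumedOffset) {}

  // On primary shutdown, promote the surviving replica that has consumed
  // the most, so the new primary never serves older data than its peers.
  static Optional<ReplicaTask> electNewPrimary(List<ReplicaTask> survivors)
  {
    return survivors.stream().max(Comparator.comparingLong(ReplicaTask::consumedOffset));
  }

  public static void main(String[] args)
  {
    List<ReplicaTask> survivors = List.of(
        new ReplicaTask("replica-1", 10_500L),
        new ReplicaTask("replica-2", 9_800L));
    System.out.println(electNewPrimary(survivors)); // replica-1 (offset 10500)
  }
}
```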

@peferron (Contributor) commented on Jul 3, 2018

@jihoonson The assumption that a slight out of sync is acceptable does not cover task failures, right? For example, if a host experiences a hardware failure, tasks on this host will fail, and new tasks will start on another host from the last committed offsets. Until they catch up, these new tasks may serve data that is out of sync by dozens of minutes or even hours, depending on how old the last committed offsets were (which depends on taskDuration and intermediateHandoffPeriod, but setting these too low will create small segments during normal operation).

Ideally there would be a way to define some threshold over which replicas are considered out-of-sync, and have brokers route queries to in-sync replicas only. This would make replicas help with resiliency and data availability, in addition to helping with performance (which seems to be their primary use right now).
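
A minimal sketch of such an in-sync filter, assuming brokers could learn each replica's consumed offset and the latest offset in Kafka (all names hypothetical, not Druid APIs):

```java
import java.util.List;
import java.util.Map;

public class InSyncFilter
{
  // Keep only replicas whose lag behind the latest Kafka offset is within
  // the configured threshold; the broker would route queries to these only.
  static List<String> inSyncReplicas(
      Map<String, Long> consumedOffsets, // taskId -> last consumed offset
      long latestOffset,
      long maxLag)
  {
    return consumedOffsets.entrySet().stream()
        .filter(e -> latestOffset - e.getValue() <= maxLag)
        .map(Map.Entry::getKey)
        .sorted()
        .toList();
  }

  public static void main(String[] args)
  {
    Map<String, Long> offsets = Map.of("replica-1", 10_000L, "replica-2", 4_000L);
    // With the latest offset at 10,050 and a lag budget of 500,
    // only replica-1 qualifies as in-sync.
    System.out.println(inSyncReplicas(offsets, 10_050L, 500L)); // [replica-1]
  }
}
```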

@gianm (Contributor) commented on Jul 5, 2018

@peferron That's true, if you have two replicas and one fails, then the replacement for the failed replica will take some time to catch up. During that time it'll be more out of sync than normal.

I think you could get most of the way there, consistency wise, with the "primary" replica concept. The idea would be that whenever you create a new replica, it should announce itself at a lower server priority than any existing replica (maybe just subtract 1 every time you make a replica). The assumption is that older replicas will generally have more data, and so you would prefer the primary to be the oldest one at any given point. See io.druid.client.DruidServer for what I'm referring to by "priority".
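
A minimal sketch of that announcement rule (hypothetical names; in Druid the priority would be announced via the field on io.druid.client.DruidServer):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ReplicaPriorityAssigner
{
  // Each newly created replica announces itself one priority step below
  // the previous one, so a broker preferring the highest priority always
  // favors the oldest surviving replica.
  private final AtomicInteger nextPriority = new AtomicInteger(0);

  int nextAnnouncementPriority()
  {
    return nextPriority.getAndDecrement();
  }

  public static void main(String[] args)
  {
    ReplicaPriorityAssigner assigner = new ReplicaPriorityAssigner();
    System.out.println(assigner.nextAnnouncementPriority()); // 0  -> original replica
    System.out.println(assigner.nextAnnouncementPriority()); // -1 -> first replacement
    System.out.println(assigner.nextAnnouncementPriority()); // -2 -> second replacement
  }
}
```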

Your idea of an "in-sync" set would be fancier, presumably taking into account the actual offsets that the tasks are at, and doing something that directs queries to the entire in-sync set rather than to a single primary replica. I can't think of a way that it could piggyback off something we already have, so it may end up being more of a new system. It should probably be mediated by the supervisor somehow, which is the thing that is in charge of keeping track of task offsets. I'm not sure what the best way would be to get that info to the broker.

Anyone interested in working on one of those ideas? :)

@stale commented on Jun 21, 2019

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

This issue was reopened on Aug 11, 2019.

@stale commented on Aug 11, 2019

This issue is no longer marked as stale.

@stale commented on May 17, 2020

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

@stale commented on Jun 14, 2020

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.
