Consumer rebalance problem when using docker container #667

Closed
@vancefantasy

Description

BUG REPORT

  1. Please describe the issue you observed:

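In a Docker environment, two different consumer instances generate the same clientId, which in turn corrupts the rebalance result. (Debugging showed that both instances obtain the same IP; I suspect this is related to Rancher networking.)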

  2. Please tell us about your environment:

rocketmq client "4.3.0"
java version "1.8.0_181"
Docker version "18.09.0"

  3. Other information (e.g. detailed explanation, logs, related issues, suggestions how to fix, etc):

Suggestion: improve the rule that generates instanceName in the ClientConfig class.

Activity

changed the title [-]Consumer-side rebalance becomes chaotic in a Docker environment[/-] [+]Consumer rebalance problem when using docker container[/+] on Jan 9, 2019
huanwei commented on Jan 9, 2019

What's the current clientId generation rule in ClientConfig class? And how to reproduce it?

vancefantasy commented on Jan 12, 2019

Author

@huanwei

clientId:

public String buildMQClientId() {
    StringBuilder sb = new StringBuilder();
    sb.append(this.getClientIP());
    sb.append("@");
    sb.append(this.getInstanceName());
    if (!UtilAll.isBlank(this.unitName)) {
        sb.append("@");
        sb.append(this.unitName);
    }
    return sb.toString();
}

instanceName:

private String instanceName = System.getProperty("rocketmq.client.name", "DEFAULT");

public void changeInstanceNameToPID() {
    if (this.instanceName.equals("DEFAULT")) {
        this.instanceName = String.valueOf(UtilAll.getPid());
    }
}

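In a Docker environment (using host mode), the clientIp and pid obtained are the same for both consumers. The exact reason the clientIp is identical has not been fully confirmed yet; it may be related to the introduction of Rancher.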
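To make the collision concrete, here is a minimal, self-contained sketch that mirrors the buildMQClientId() logic quoted above without depending on RocketMQ. The class name, the helper signature, and the sample values are assumptions for illustration only: the address is the one reported later in this thread, the pid "1" is simply a plausible value for a container's main process, and the blank check only approximates UtilAll.isBlank.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for ClientConfig; not part of RocketMQ.
final class ClientIdCollisionSketch {

    // Mirrors the quoted buildMQClientId(): IP@instanceName[@unitName].
    static String buildClientId(String clientIp, String instanceName, String unitName) {
        StringBuilder sb = new StringBuilder();
        sb.append(clientIp).append("@").append(instanceName);
        // Approximation of UtilAll.isBlank (null, empty, or whitespace only).
        if (unitName != null && !unitName.trim().isEmpty()) {
            sb.append("@").append(unitName);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed values: two containers that report the same IP and the same pid.
        String idA = buildClientId("172.17.0.1", "1", null);
        String idB = buildClientId("172.17.0.1", "1", null);

        Set<String> ids = new HashSet<>();
        ids.add(idA);
        ids.add(idB);

        // Two consumers, but only one distinct clientId is visible to rebalancing.
        System.out.println(idA + " / " + idB + " -> distinct ids: " + ids.size());
    }
}
```

Because both consumers present the same clientId, the consumer group effectively sees one member where there are two, which is consistent with the confused rebalance result described in the report.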

#668

caigy commented on May 21, 2019

Contributor

I encountered the same problem. Under host networking mode, all Docker containers managed by Rancher have the same docker0 IP, 172.17.0.1. As a result, RemotingUtil.getLocalAddress() always returns "172.17.0.1" for those containers, which causes clientId collisions between consumers that have the same pid when the default "IP@pid" clientId is used.

  • RocketMQ client Version: 4.3.2
  • Rancher Version: 1.16.10
  • Docker Version: 18.09.5

vongosling commented on May 22, 2019

Member

@huanwei This is a known problem with Docker containers when multiple containers are deployed on one machine.

duhenglucky commented on May 22, 2019

Contributor

@caigy @huanwei If Rancher is used to manage the containers, every container always gets "172.17.0.1" from getLocalAddress, not only in bridge mode but also in host mode, so I think this may be an issue caused by Rancher. You can temporarily work around the problem by setting the instanceName.

huanwei commented on May 22, 2019

That's a known issue caused by the Rancher container network. Setting a different instanceName for each consumer should solve this problem, just as @duhenglucky suggested.
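One way to apply this workaround is sketched below. It is only an illustration under stated assumptions: the class and helper names are hypothetical, the HOSTNAME environment variable may or may not be set in a given container, and a random UUID is used for uniqueness because the hostname can be identical across containers in host networking mode. The only name taken from this thread is the "rocketmq.client.name" system property, which the instanceName field quoted earlier reads when the client object is created.

```java
import java.util.UUID;

// Hypothetical helper; not part of RocketMQ.
final class InstanceNameSketch {

    // Builds a name that differs on every call, optionally prefixed with a hint
    // such as the container hostname. The UUID provides the uniqueness.
    static String uniqueInstanceName(String hint) {
        String prefix = (hint == null || hint.trim().isEmpty()) ? "consumer" : hint.trim();
        return prefix + "-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        // Assumption: HOSTNAME may be unset or shared between containers.
        String name = uniqueInstanceName(System.getenv("HOSTNAME"));

        // Per the quoted snippet, instanceName defaults to this property, so it
        // has to be set before the consumer or producer object is constructed.
        System.setProperty("rocketmq.client.name", name);

        System.out.println(System.getProperty("rocketmq.client.name"));
    }
}
```

Since the quoted changeInstanceNameToPID() only replaces the value when it equals "DEFAULT", a custom name like this would not be overwritten by the pid. Setting the name directly on the client object instead of through the system property should work the same way.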

haycco commented on Jul 1, 2019

Which version fixes this? I am running into the same problem.

added this to the 4.6.0 milestone on Aug 8, 2019
modified the milestones: 4.6.0, 4.7.0 on Nov 11, 2019
modified the milestones: 4.7.0, 4.8.0 on Feb 28, 2020

maixiaohai commented on Jan 19, 2021

Contributor

Has this been fixed?

Aaron-TangCode commented on Jan 7, 2022

Contributor

Apart from manually setting a custom instanceName, will RocketMQ do anything to accommodate this case? Or is that no longer being considered?

Git-Yang commented on Jan 8, 2022

Member

I would like a more general way to completely solve the problem of instance conflicts. #3680

          Consumer rebalance problem when using docker container · Issue #667 · apache/rocketmq