
Issue when using kubeadm with multiple network interfaces #33618

Closed
@ghost

Description

Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.): No

What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): failed


Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Kubernetes version (use kubectl version): Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.0", GitCommit:"a16c0a7f71a6f93c7e0f222d961f4675cd97a46b", GitTreeState:"clean", BuildDate:"2016-09-26T18:16:57Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: local VirtualBox machines managed with Vagrant
  • OS (e.g. from /etc/os-release): Ubuntu 16.04 (both host and VMs)
  • Kernel (e.g. uname -a): Linux kubenode01 4.4.0-38-generic #57-Ubuntu SMP Tue Sep 6 15:42:33 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: kubeadm
  • Others: Vagrant

What happened: I'm testing a Kubernetes cluster with VirtualBox and Vagrant. Every VM has a NATed interface (eth0) and a Host-Only interface (eth1). I wanted the first node to join the master using the Host-Only IP address, since Kubernetes listens on all interfaces by default.

What you expected to happen: Since the master is listening on all available interfaces, I was expecting it to just work.

How to reproduce it (as minimally and precisely as possible): Set up two VMs, each with one NATed interface and one Host-Only interface. Follow the steps from http://kubernetes.io/docs/getting-started-guides/kubeadm/ but, instead of using exactly the kubeadm join ... command line returned by the master, change the IP address to that of the Host-Only interface.

Anything else we need to know: You can use this to reproduce the issue: https://github.com/hgfischer/Kubernetes14

Activity

luxas (Member) commented on Sep 28, 2016

No, Kubernetes isn't listening on all interfaces by default. It picks the interface with the default gateway and listens on that. Use --api-advertise-addresses=<the eth1 IP addr> on kubeadm init in order to use the host-only interface.
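For anyone following along, the 1.4-era invocation would look something like this. The host-only address 192.168.56.10 is an assumption; substitute your VM's eth1 address (note that later kubeadm versions renamed the flag to --apiserver-advertise-address):

```shell
# Find the host-only (eth1) address on the master VM:
ip -4 addr show dev eth1

# Initialize the master so it advertises that address to joining nodes
# (kubeadm 1.4-era flag spelling, as used in this thread):
sudo kubeadm init --api-advertise-addresses=192.168.56.10
```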

cc @kubernetes/sig-cluster-lifecycle

errordeveloper (Member) commented on Sep 28, 2016

To be clear, it does listen on all interfaces right now, and we are planning to provide a flag to make it bind to a specific address, but the default will probably remain 0.0.0.0 for now. Please also see #33562 (comment) and #33638.

lukemarsden (Contributor) commented on Sep 28, 2016

Should be fixed in http://kubernetes.io/docs/getting-started-guides/kubeadm/#instructions now, @hgfischer could you please take a look and let me know if that docs change would have helped you out? Thanks!

danielschonfeld (Contributor) commented on Sep 28, 2016

What about the nodes themselves? Why isn't there a flag (via kubeadm join) to pass the --address flag on to the kubelet? I have a bare-metal installation with two ethernet devices, and it just so happens that my internal network doesn't have the gateway on it. When installing weave-net, the IP used to communicate between nodes ends up being my external IP and not the internal one.

Thanks!

lukemarsden (Contributor) commented on Sep 28, 2016

Hey @danielschonfeld could you please open a ticket against https://github.com/weaveworks/weave-kube for the latter issue? I'll ask my colleagues to take a look. Thanks!

danielschonfeld (Contributor) commented on Sep 28, 2016

@lukemarsden it's not a weave issue... weave only runs kube-peers, which collects the advertised address from each node record from the API server. It's the kubelet on the individual nodes that does exactly what you described above: it takes the first IP with a gateway it finds, and in my installation it just so happens that the first IP is the external one and not the internal one (because our internal network at our datacenter provider runs on an unmanaged switch, meaning no gateway).
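A quick way to see which address the kubelet will end up picking under this behavior is to inspect the default route; this is a generic iproute2 diagnostic, not anything kubeadm-specific:

```shell
# Show the default route; the "dev" field is the interface the kubelet
# derives its node IP from (per the behavior described above):
ip route show default

# Show the IPv4 address on that interface (assumes the usual
# "default via <gw> dev <iface> ..." output shape):
ip -4 addr show "$(ip route show default | awk '{print $5; exit}')"
```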

errordeveloper (Member) commented on Sep 28, 2016

OK, I see what the problem is. The node registration IP is outside of kubeadm's control. Right now you will need to either add a kubelet systemd drop-in unit, or modify the one kubeadm installs. Do you know how to do this, or do you need more details?


danielschonfeld (Contributor) commented on Sep 28, 2016

@errordeveloper I'll need some details for the latter option. The former I'm familiar with, but I kind of like the latter idea better. Thank you!

errordeveloper (Member) commented on Sep 28, 2016

The Debian and RPM packages of kubeadm install a drop-in under /etc/systemd/system/kubelet.service.d/10-kubeadm.conf; it has the kubelet flags needed for kubeadm to work. The file is the same on the nodes and the master. You need to add a flag to it (IIRC --override-external-hostname) and pass the IP or hostname that you want, but check the kubelet docs.
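As a sketch of that edit: the drop-in path is from the comment above, but the flag name and address are assumptions (the author says "IIRC", so verify against kubelet --help for your version):

```shell
# Append an override flag to the kubelet invocation in the kubeadm drop-in
# (--hostname-override and the IP are placeholders - check your kubelet docs):
sudo sed -i \
  's|^ExecStart=/usr/bin/kubelet|& --hostname-override=10.0.0.101|' \
  /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

# Pick up the edited unit and restart the kubelet:
sudo systemctl daemon-reload
sudo systemctl restart kubelet
```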


danielschonfeld (Contributor) commented on Sep 28, 2016

@errordeveloper thank you!! I'll go tinker with it and report back if I run into trouble.


errordeveloper (Member) commented on Sep 30, 2016

@hgfischer this sounds like a more general bug (not so much related to kubeadm), could you please create another issue?

errordeveloper (Member) commented on Oct 3, 2016

"also does kubeadm join copy the 10-kubeadm.conf from the master on joining?"

Missed this at first... The answer is no: it doesn't need to go down to that level; in fact, kubeadm is completely unaware of systemd as such. The file is installed by the kubeadm package, and it's the same on the master as well as on all of the nodes.

karthik101 commented on Jul 17, 2017

Well, I had the same issue in 1.7.1. Fixed it with:

kubeadm reset
kubeadm init --apiserver-advertise-address=10.10.10.4

Thank you @luxas.

jboero commented on Apr 15, 2019

I know this thread is super long and pretty old, but I figured I'd add my $0.02 for anyone else who comes along. I also have many interfaces and have struggled with kubeadm across various versions. I find that the best way to deploy a single-node local dev environment is to use a dummy interface, since you can't run kubeadm on loopback (how annoying) and I've rarely been successful with the default route. This is especially handy on a laptop with only wifi, where you can never expect a consistent IP across access points and coffee shops, and where you can't work at all when disconnected or on a plane without wifi:

$ sudo ip link add dummy0 type dummy

Or configure permanent dummy devices as described here:
https://unix.stackexchange.com/questions/335284/how-can-we-create-multiple-dummy-interfaces-on-linux

Configure a static IP on the dummy interface and use it for --apiserver-advertise-address. Since nothing else is messing with routes on that device, it's clean and ready to use locally. Super handy, and it sometimes bypasses issues with flannel/calico not getting routes from the host (at least in my experience).
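Put together, the dummy-interface recipe might look like this; the 10.254.254.1/32 address is an arbitrary choice for illustration, so pick any address that won't collide with your real networks:

```shell
# Create the dummy interface, give it a static address, and bring it up:
sudo ip link add dummy0 type dummy
sudo ip addr add 10.254.254.1/32 dev dummy0
sudo ip link set dummy0 up

# Advertise the API server on the dummy address:
sudo kubeadm init --apiserver-advertise-address=10.254.254.1
```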

mwtzzz-zz commented on Jun 25, 2019

I'd like to add that in order for me to get this to work (same issue as the OP), it was necessary to add a route on the worker nodes:
ip route add 10.96.0.0/16 dev eth1 src <private network ip>
where "private network ip" is the IP assigned to the node's interface that's on the same network as the master node (likely a 172.28... address if you tend to use the defaults in Vagrant).

The other solutions above were helpful in getting some but not all of the errors fixed. This ip route was the final piece needed.
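Concretely, that final piece might look like the following on each worker. 10.96.0.0/16 is the service range from the comment above, and 172.28.128.101 is a hypothetical Vagrant host-only address; use your node's own:

```shell
# Route the service CIDR out the private (eth1) interface with an explicit
# source address, so traffic to services uses the internal network:
sudo ip route add 10.96.0.0/16 dev eth1 src 172.28.128.101

# Sanity check: ask the kernel which interface/source it now picks
# for an address inside the service range:
ip route get 10.96.0.1
```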

kerren commented on Jul 29, 2019

Hi everyone,

Just to add to this: I was having problems getting my nodes to join the master node (this was not a cloud setup). I did an extra step that got everything working pretty well: I added an entry to the /etc/hosts file on the node itself that pointed the node's hostname at its IP address on the correct interface.

So, for example, if your node's hostname is node-1.cluster.example.com and the IP address is 10.0.0.101, then you'd add the following line to your hosts file:

10.0.0.101 node-1.cluster.example.com

When you run the kubeadm join command, it should then use the correct interface.
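A sketch of that step, using the example hostname and IP from the comment (adjust both for your node):

```shell
# Map the node's hostname to its address on the correct interface:
echo '10.0.0.101 node-1.cluster.example.com' | sudo tee -a /etc/hosts

# Verify the hostname now resolves to the intended address:
getent hosts node-1.cluster.example.com
```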

vaibhav-kaushal commented on Jan 16, 2020

@mwtzzz That solution worked, but now I am seeing multiple pods running on different nodes with the same IP address. What might be wrong here?

Output of kubectl get pods -A -o wide:

NAMESPACE     NAME                               READY   STATUS    RESTARTS   AGE     IP          NODE       NOMINATED NODE   READINESS GATES
default       busybox-6cd57fd969-zr576           1/1     Running   0          3m16s   10.32.0.2   server-5   <none>           <none>
default       nginx-86c57db685-2zcjr             1/1     Running   0          11m     10.32.0.2   server-3   <none>           <none>
kube-system   coredns-6955765f44-gglvz           1/1     Running   5          3h31m   10.32.0.3   server-1   <none>           <none>
kube-system   coredns-6955765f44-nc8ff           1/1     Running   5          3h31m   10.32.0.2   server-1   <none>           <none>
kube-system   etcd-server-1                      1/1     Running   5          3h31m   10.0.2.15   server-1   <none>           <none>
kube-system   kube-apiserver-server-1            1/1     Running   5          3h31m   10.0.2.15   server-1   <none>           <none>
kube-system   kube-controller-manager-server-1   1/1     Running   5          3h31m   10.0.2.15   server-1   <none>           <none>
kube-system   kube-proxy-8mnlb                   1/1     Running   0          55m     10.0.2.15   server-5   <none>           <none>
kube-system   kube-proxy-9hpcr                   1/1     Running   1          55m     10.0.2.15   server-2   <none>           <none>
kube-system   kube-proxy-q2np4                   1/1     Running   0          55m     10.0.2.15   server-4   <none>           <none>
kube-system   kube-proxy-vw68k                   1/1     Running   1          56m     10.0.2.15   server-3   <none>           <none>
kube-system   kube-proxy-wjpbb                   1/1     Running   1          56m     10.0.2.15   server-1   <none>           <none>
kube-system   kube-scheduler-server-1            1/1     Running   5          3h31m   10.0.2.15   server-1   <none>           <none>
kube-system   weave-net-2jttm                    2/2     Running   28         158m    10.0.2.15   server-2   <none>           <none>
kube-system   weave-net-fhs59                    2/2     Running   27         158m    10.0.2.15   server-3   <none>           <none>
kube-system   weave-net-g67sp                    2/2     Running   9          158m    10.0.2.15   server-1   <none>           <none>
kube-system   weave-net-jqz69                    2/2     Running   30         158m    10.0.2.15   server-5   <none>           <none>
kube-system   weave-net-qc4ns                    2/2     Running   25         143m    10.0.2.15   server-4   <none>           <none>

Notice the first 4 lines!

mwtzzz-zz commented on Jan 16, 2020

zimmertr commented on Apr 2, 2020

Neither of the posted solutions is working for me with Packer/Vagrant and RHEL 7, using kubeadm to bootstrap a v1.18.0 cluster.

Some network information:

LAN CIDR: 192.168.30.0/24
VPN CIDR: 172.27.0.0/16
Docker Engine CIDR: 192.168.65.0/24
Kubernetes Pod CIDR: 172.16.0.0/16
Kubernetes Service CIDR: 172.32.0.0/16
Calico IPV4 Pool CIDR: 172.16.0.0/16

The calico-node daemonset also has IP_AUTODETECTION_METHOD set to interface=eth1. This is because my Vagrant VMs come up with eth0 populated with 10.0.2.15/24 (I'm not sure why), while eth1 has the proper IP address supplied via Vagrant.
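For reference, one way to apply that setting is with kubectl's generic env editor. The daemonset name and the k8s-app=calico-node label are the usual ones in the Calico manifests, but verify against your install:

```shell
# Tell calico-node to autodetect its address from eth1:
kubectl -n kube-system set env daemonset/calico-node \
  IP_AUTODETECTION_METHOD=interface=eth1

# Watch the pods roll and re-check their detected addresses:
kubectl -n kube-system get pods -l k8s-app=calico-node -o wide
```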

The calico-node pods running on each worker node are producing the following logs repeatedly:

[INFO][8] startup.go 365: Hit error connecting to datastore - retry error=Get https://172.32.0.1:443/api/v1/nodes/foo: dial tcp 172.32.0.1:443: connect: connection refused

I can curl that endpoint from the master node but not from the worker nodes, since they communicate via the LAN CIDR mentioned above (192.168.30.0/24).

Adding routes like @mwtzzz specified did not yield a working cluster unfortunately.

Any ideas?


          Issue when using kubeadm with multiple network interfaces · Issue #33618 · kubernetes/kubernetes