
Plugin [loop] does not work with systemd-resolved running #2087

Closed
mritd opened this issue Sep 6, 2018 · 32 comments

Comments

@mritd

mritd commented Sep 6, 2018

When systemd-resolved is running, the nameserver in /etc/resolv.conf defaults to 127.0.0.53.
The loop plugin then sees its probe query come back more than twice, and CoreDNS fails to start.
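
For reference, one quick way to confirm the stub-resolver setup on the node (a sketch; paths assume the standard systemd-resolved layout):

cat /etc/resolv.conf                    # expect: nameserver 127.0.0.53 (the local stub listener)
cat /run/systemd/resolve/resolv.conf    # the file that holds the real upstream nameservers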

Environment:

  • Ubuntu 18.04.1
  • Kubernetes 1.11.2
  • CoreDNS 1.2.2

Error log:

docker1.node ➜  kubectl logs coredns-55f86bf584-7sbtj -n kube-system
.:53
2018/09/06 13:02:45 [INFO] CoreDNS-1.2.2
2018/09/06 13:02:45 [INFO] linux/amd64, go1.11, eb51e8b
CoreDNS-1.2.2
linux/amd64, go1.11, eb51e8b
2018/09/06 13:02:45 [INFO] plugin/reload: Running configuration MD5 = 86e5222d14b17c8b907970f002198e96
2018/09/06 13:02:45 [FATAL] plugin/loop: Seen "HINFO IN 2050421060481615995.5620656063561519376." more than twice, loop detected

Deployed with deploy.sh.

@chrisohaver
Member

This is working as intended. The loop plugin has detected a forwarding loop, caused by systemd-resolved. If CoreDNS didn't exit, it would loop "forever" on the first upstream query it receives and get OOM killed.

The best fix is to add a flag to kubelet so it uses the original resolv.conf:
--resolv-conf=/run/systemd/resolve/resolv.conf, then restart the CoreDNS pods.
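
On a kubeadm-managed node, one way to apply that is roughly the following (a sketch; the drop-in file below is kubeadm's Debian/Ubuntu default, RPM-based systems use /etc/sysconfig/kubelet):

# /etc/default/kubelet  (assumption: kubeadm's systemd drop-in sources this file)
KUBELET_EXTRA_ARGS=--resolv-conf=/run/systemd/resolve/resolv.conf

# reload kubelet and recreate the CoreDNS pods
sudo systemctl restart kubelet
kubectl -n kube-system delete pod -l k8s-app=kube-dns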

@mritd
Author

mritd commented Sep 6, 2018

Thanks for your answer, this is a good idea. (I just solved it by stopping systemd-resolved, stupid me 😂).

mritd closed this as completed Sep 6, 2018
@miekg
Member

miekg commented Sep 8, 2018 via email

@avaikararkin

avaikararkin commented Sep 28, 2018

I am facing the same issue:

[root@faas-cent1 ~]# kubectl logs coredns-7f4b9fccc6-6bg7s -n kube-system
.:53
2018/09/28 09:24:50 [INFO] CoreDNS-1.2.2
2018/09/28 09:24:50 [INFO] linux/amd64, go1.11, eb51e8b
CoreDNS-1.2.2
linux/amd64, go1.11, eb51e8b
2018/09/28 09:24:50 [INFO] plugin/reload: Running configuration MD5 = f65c4821c8a9b7b5eb30fa4fbc167769
2018/09/28 09:24:56 [FATAL] plugin/loop: Seen "HINFO IN 6010196033322906137.8653621564656081764." more than twice, loop detected

This is on CentOS 7, and no, my /etc/resolv.conf does not have a 127.* entry.
It is this:

[root@faas-cent1 ~]# cat /etc/resolv.conf

# Generated by NetworkManager

nameserver 10.148.20.5
[root@faas-cent1 ~]#

[root@faas-cent1 ~]# docker --version
Docker version 18.06.1-ce, build e68fc7a
[root@faas-cent1 ~]#

[root@faas-cent1 ~]# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.0", GitCommit:"0ed33881dc4355495f623c6f22e7dd0b7632b7c0", GitTreeState:"clean", BuildDate:"2018-09-27T17:02:38Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
[root@faas-cent1 ~]#

[root@faas-cent1 ~]# uname -a
Linux faas-cent1 3.10.0-862.11.6.el7.x86_64 #1 SMP Tue Aug 14 21:49:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@faas-cent1 ~]#

I don't have a /run/systemd/resolve/resolv.conf file on my system to try the workaround.
dnsmasq seems to be running on the system though; could that be causing this issue?
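
One way to see what is actually answering DNS on the node (a sketch, assuming ss is installed):

sudo ss -lntup | grep ':53 '   # shows which process (dnsmasq, systemd-resolved, ...) owns port 53
cat /etc/resolv.conf           # the file kubelet hands to pods unless told otherwise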

@Asisranjan

I am getting the same error too.
.:53
2018/10/04 12:18:47 [INFO] CoreDNS-1.2.2
2018/10/04 12:18:47 [INFO] linux/amd64, go1.11, eb51e8b
CoreDNS-1.2.2
linux/amd64, go1.11, eb51e8b
2018/10/04 12:18:47 [INFO] plugin/reload: Running configuration MD5 = 486384b491cef6cb69c1f57a02087363
2018/10/04 12:18:53 [FATAL] plugin/loop: Seen "HINFO IN 7533478916006617590.6696743068873483726." more than twice, loop detected

@chrisohaver
Member

This is the loop detection detecting a loop, and exiting. This is the intended behavior, unless of course there is no loop.

If you doubt there is a loop, you may try removing the loop detection (remove loop from the coredns configuration), and then test DNS resolution from pods (i.e. test resolution to external domains from the command line of a pod running in the cluster).
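
For example, from inside the cluster (a sketch; the image and pod name are only illustrative):

kubectl run -it --rm dnstest --image=busybox:1.28 --restart=Never -- nslookup kubernetes.default
kubectl run -it --rm dnstest --image=busybox:1.28 --restart=Never -- nslookup example.com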

@johnbelamaric
Member

johnbelamaric commented Oct 4, 2018 via email

@miekg
Member

miekg commented Oct 4, 2018 via email

@avaikararkin

In my case, it seemed to be a problem with IPv6: the VM I had created had IPv6 turned on by default, and there was a matching ::1 entry in /etc/resolv.conf. I turned IPv6 off, removed the ::1 entries, and things seem to be working.
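
For anyone in the same situation, the node-level change looks roughly like this (a sketch, not verbatim from this thread; make it persistent via /etc/sysctl.conf if you keep it):

sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1
# then remove the ::1 nameserver line from /etc/resolv.conf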

@chrisohaver
Member

Seems like the error message needs to be clearer. It should say something like ...

LOL, I just saw this now, after I submitted a PR for it.

@johnbelamaric
Member

No problem. @avaikararkin, you could add details to the README Troubleshooting section...

@ahalimkara

Removing the loop plugin worked for me. Is there any side effect of removing loop from the CoreDNS configuration?

If you doubt there is a loop, you may try removing the loop detection (remove loop from the coredns configuration), and then test DNS resolution from pods (i.e. test resolution to external domains from the command line of a pod running in the cluster).

@miekg
Member

miekg commented Oct 20, 2018 via email

@spitfire88

spitfire88 commented Oct 23, 2018

remove loop from the coredns configuration

How do you do that?

@chrisohaver
Member

@spitfire88, remove loop from the Corefile (in k8s, the Corefile is in the coredns configmap)

@chrisohaver
Member

chrisohaver commented Oct 23, 2018

e.g.

kubectl -n kube-system edit configmap coredns

Then delete the line that says loop, and save the configuration. It can take several minutes for k8s to propagate the config change to the coredns pods.
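
For reference, the relevant part of a typical kubeadm-era Corefile looks roughly like this (a sketch; your cluster's Corefile may differ):

.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        upstream
        fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    proxy . /etc/resolv.conf
    cache 30
    # the next line is the one to delete (or, better, fix the loop it is detecting)
    loop
    reload
    loadbalance
}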

@zhuziying

Hi @chrisohaver, what is the meaning of loop in the Corefile?

@chrisohaver
Member

@zhuziying

Thanks, @chrisohaver.

@SiddheshRane

SiddheshRane commented Nov 12, 2018

I recently faced this problem. It is not specific to systemd-resolved. On Ubuntu 16.04, which does not have systemd-resolved, resolv.conf contains a localhost DNS server.
My question is: why don't we simply ignore any IP that points to localhost, like 127.0.0.1, ::1, etc.?
Right now I need to use fragile hacks like pointing to /var/run/systemd/resolve/resolv.conf.

@chrisohaver
Member

@SiddheshRane, I think in 16.04, DNS is managed by NetworkManager, which can essentially do the same thing as systemd-resolved as it pertains to DNS; it can run a local DNS cache (dnsmasq).

Skipping over loopbacks such as 127.0.0.1 would not solve the larger problem, because these configurations typically contain only a local address in /etc/resolv.conf. Skipping it would still result in non-functional DNS for upstream queries, because no upstream server would be configured. Functionally, the correct resolv.conf file to use is the one that contains the actual upstream servers used by the host.

In the context of Kubernetes, the best fix is to properly configure kubelet, so it can pass the correct resolv.conf file to all Pods using the Default DNS policy.
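
If kubelet is driven by a config file rather than command-line flags, the equivalent setting looks roughly like this (a sketch, assuming the kubelet.config.k8s.io/v1beta1 KubeletConfiguration format; the path below is kubeadm's default and may differ):

# /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
resolvConf: /run/systemd/resolve/resolv.conf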

@bwillcox

I've tried the extra config with and without quotes on the parameter, and it prevents the kubelet from starting. I'm sure it's a newbie mistake, and apologies if this isn't the right place for this:
sudo minikube start --vm-driver=none --extra-config=kubelet.ResolverConfig="/var/run/systemd/resolve/resolv.conf"

@chrisohaver
Member

Probably best to ask in the minikube repo, but that syntax seems correct, from what I just read.
Do the kubelet logs reveal any hints?
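
For example, on a systemd host (a sketch):

journalctl -u kubelet --no-pager | grep -i resolv   # check how the flag was parsed
journalctl -u kubelet -f                            # or follow the log while minikube starts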

@bwillcox

This is from syslog; it looks like the flag is not being passed as expected (maybe my expectations, set by https://kubernetes.io/docs/setup/minikube/#quickstart, are incorrect):
Nov 19 16:10:53 ubuntu kubelet[16413]: F1119 16:10:53.060353 16413 server.go:145] unknown flag: --ResolverConfig

This gave me an idea to try this...
ubuntu % sudo minikube start --vm-driver=none --extra-config=kubelet.resolv-conf=/var/run/systemd/resolve/resolv.conf

...and that seems to have worked; coredns and kube-dns are now much happier.

Thanks for the nudge...

@chrisohaver
Member

maybe my expectations, set by https://kubernetes.io/docs/setup/minikube/#quickstart, are incorrect

Yes, it seems those docs are incorrect.

@utkuozdemir

I shared the solution that has worked for me here: https://stackoverflow.com/a/53414041/1005102

@GOOD21

GOOD21 commented Dec 4, 2018

@chrisohaver Is there any way to disable the loop plugin when I init the k8s cluster, such as some configuration for "kubeadm init"?

@csuxh

csuxh commented Jun 4, 2019

Hi guys,
I removed loop but still get the same error. How can I solve this?
E0604 06:56:14.691993 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:315: Failed to list *v1.Endpoints: Get https://10.254.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: x509: certificate is valid for 127.0.0.1, 10.211.55.20, 10.211.55.21, 10.211.55.22, not 10.254.0.1
E0604 06:56:14.743608 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:320: Failed to list *v1.Namespace: Get https://10.254.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: x509: certificate is valid for 127.0.0.1, 10.211.55.20, 10.211.55.21, 10.211.55.22, not 10.254.0.1

@csuxh

csuxh commented Jun 4, 2019

The kubectl describe output is like this:
Normal Scheduled 8m16s default-scheduler Successfully assigned kube-system/coredns-784f8f5b7b-c9nc7 to kube-node3
Warning Unhealthy 6m32s (x5 over 7m12s) kubelet, kube-node3 Liveness probe failed: HTTP probe failed with statuscode: 503
Normal Killing 6m32s kubelet, kube-node3 Container coredns failed liveness probe, will be restarted
Normal Pulled 6m2s (x2 over 8m15s) kubelet, kube-node3 Container image "coredns/coredns:1.1.3" already present on machine
Normal Created 6m2s (x2 over 8m15s) kubelet, kube-node3 Created container coredns
Normal Started 6m1s (x2 over 8m14s) kubelet, kube-node3 Started container coredns
Warning Unhealthy 3m7s (x31 over 8m7s) kubelet, kube-node3 Readiness probe failed: HTTP probe failed with statuscode: 503

@mritd
Author

mritd commented Jun 4, 2019

@csuxh This does not seem to be a problem with CoreDNS. The reason is that the certificate used by your API server does not contain the IP 10.254.0.1.
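
A quick way to confirm which names and IPs the API server certificate actually covers (a sketch; the path assumes a kubeadm layout):

sudo openssl x509 -noout -text -in /etc/kubernetes/pki/apiserver.crt | grep -A1 'Subject Alternative Name'
# the service IP (10.254.0.1 here) must appear in that list, or the certificate needs to be regenerated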

@Ramane19

I had this same issue after deleting the loop.

Can someone help me with this?

kubectl logs coredns-fb8b8dccf-j6mjl -n kube-system
Error from server (BadRequest): container "coredns" in pod "coredns-fb8b8dccf-j6mjl" is waiting to start: ContainerCreating
master@master:~$ sudo kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-fb8b8dccf-j6mjl 0/1 ContainerCreating 0 7m31s
kube-system coredns-fb8b8dccf-lst4v 0/1 ContainerCreating 0 7m31s
kube-system etcd-master.testcluster.com 1/1 Running 0 25m
kube-system kube-apiserver-master.testcluster.com 1/1 Running

@chrisohaver
Member

@Ramane19, your pods are stuck in "ContainerCreating", which is a different issue.
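
The pod events usually say why; for example (a sketch):

kubectl -n kube-system describe pod coredns-fb8b8dccf-j6mjl   # check the Events section
# CoreDNS pods stuck in ContainerCreating are commonly waiting on the CNI network plugin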

coredns locked as resolved and limited conversation to collaborators Jun 18, 2019