Upgrading docker 1.13 on nodes causes outbound container traffic to stop working #40182

Closed

Description

colemickens (Contributor)

Kubernetes version (use kubectl version): v1.4.6, v1.5.1, likely many versions

Environment:

  • Cloud provider or hardware configuration: Azure / Azure Container Service
  • OS (e.g. from /etc/os-release): Ubuntu Xenial
  • Kernel (e.g. uname -a): latest 16.04-LTS kernel
  • Install tools: Cloud-Init + hyperkube
  • Others:

Configuration Details:

  • kubelet runs in a container
  • master services run as static manifests
  • kube-addon-manager runs as a static manifest
  • kube-proxy runs in iptables mode via a daemonset

What happened:
After upgrading to docker 1.13.0 on the nodes, outbound container traffic stops working

What you expected to happen:
Outbound container traffic should work (i.e., I can reach the internet and Service IPs from inside a container).

How to reproduce it (as minimally and precisely as possible):
Deploy an ACS Kubernetes cluster. If the workaround has rolled out, then force-upgrade Docker to 1.13 (you'll have to remove a pin we're setting in /etc/apt/preferences.d).
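
For reference, the pin is an ordinary apt preferences entry; a hypothetical example of what such an entry could look like (the actual file name, package name, and version pattern on ACS nodes may differ):

Explanation: hypothetical example pin; keeps docker-engine on the 1.12 series
Package: docker-engine
Pin: version 1.12.*
Pin-Priority: 550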

It's unclear whether this reproduces on other configurations right now.

Anything else we need to know:

No, I just don't know where/how to best troubleshoot this.

Activity

Labels sig/node and sig/network added on Jan 20, 2017
0xmichalis (Contributor) commented on Jan 20, 2017

@kubernetes/sig-node-misc

dkerwin commented on Jan 23, 2017

Can confirm the problem with k8s 1.4.7 & Docker 1.13 on Debian Jessie; kubelet is managed by systemd.

colemickens (Contributor, Author) commented on Jan 24, 2017

Since the team @Kargakis tagged here is no longer a team... cc: @kubernetes/sig-node-bugs

bboreham (Contributor) commented on Jan 31, 2017

Docker 1.13 changed the default iptables forwarding policy to DROP, which has effects like this.

You can change the policy to ACCEPT (which it was in Docker 1.12 and before) by running:

sudo iptables -P FORWARD ACCEPT

on every node. You need to run this in the host network namespace, not inside a pod namespace.
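
To check whether a node is affected, and to confirm the change took effect, something like the following on the node itself (a minimal sketch):

sudo iptables -S FORWARD | head -n 1    # prints "-P FORWARD DROP" on an affected node
sudo iptables -P FORWARD ACCEPT
sudo iptables -S FORWARD | head -n 1    # should now print "-P FORWARD ACCEPT"

Note that the policy does not persist across a reboot (and the Docker daemon may set it back to DROP when it restarts), so you probably want this in a boot-time script or unit rather than run once by hand.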

MaesterZ commented on Jan 31, 2017

sudo iptables -P FORWARD ACCEPT

Tested and working.

Environment:

  • Cloud provider: AWS
  • OS: Ubuntu Server 16.04 LTS
  • Kernel: 4.4.0-59
  • Kubernetes: 1.5.2
  • Docker: 1.13
  • Network plugin: Weave 1.8.2
  • Network: VPC with NAT gateway, internet gateway, public & private subnets

feiskyer (Member) commented on Feb 1, 2017

Docker 1.13 changed the default iptables forwarding policy to DROP, which has effects like this.

Could someone explain why Docker is defaulting to DROP? Does this mean containers on Docker v1.13 can't connect to the outside by default?

bboreham (Contributor) commented on Feb 1, 2017

@feiskyer Generally the Linux default is to have IP forwarding off.
Docker used to turn it on across the board, which was (a) unnecessary and (b) a security issue; 1.13 removed this.

Docker adds two specific rules which allow traffic off its bridge, and replies to come back:

-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT

CNI providers which do not use the docker0 bridge need to make similar provision.
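
For a bridge-based plugin such as kubenet, whose bridge is typically named cbr0, the analogous rules would look roughly like this (a sketch only; the plugin or node provisioning would need to install them):

-A FORWARD -i cbr0 ! -o cbr0 -j ACCEPT
-A FORWARD -o cbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT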

bboreham (Contributor) commented on Feb 1, 2017

@colemickens can you clarify which network plugin you are using - is it kubenet?

Karimerto commented on Feb 1, 2017

This thread sure was a lifesaver, though sadly I found it about 14 hours too late. I managed to wipe my entire cluster and reinstall everything, with the same issue still persisting. I was about to lose my mind trying to figure out why half of my original cluster was working and the other half wasn't. The nodes that didn't work were installed and added later with Docker 1.13, so this explains everything.

Now I've got everything up and running again!

Thanks again for this 👍

jbeda (Contributor) commented on Feb 2, 2017

The docker change that caused this: moby/moby#28257

colemickens (Contributor, Author) commented on Feb 2, 2017

(@bboreham Yes, it was kubenet.)

(80 remaining items not shown)


Metadata

Labels: area/docker, kind/bug, sig/network, sig/node
