
Document the 127.0.0.6 magic #29603

Closed

Description

@yanniszark

Bug description

This is more of a question than a bug. We (Arrikto) noticed the following behavior in Istio:
incoming connections to declared service ports are proxied through 127.0.0.1, but connections to undeclared ports are proxied through 127.0.0.6. In the code, I see this is referred to as InboundPassthroughClusterIpv4. @dntosas then found this interesting comment: https://github.com/istio/istio/pull/15906/files#r308491044

So it seems like some "magic" was necessary, but I don't understand why. One possible reason we could think of is that the upstream app can then tell whether the downstream connection came through a declared or an undeclared port. Or maybe it has nothing to do with that. cc'ing @howardjohn @lambdai based on the discussion in the review comments.

[x] Docs
[ ] Installation
[ ] Networking
[ ] Performance and Scalability
[ ] Extensions and Telemetry
[ ] Security
[ ] Test and Release
[ ] User Experience
[ ] Developer Infrastructure
[ ] Upgrade

Steps to reproduce the bug

  1. Set up a server and a client Pod, each with a sidecar.
  2. Create a k8s Service that declares port 9090 on the server (a minimal sketch follows this list).
  3. Start a netcat server in the server pod on port 9090.
  4. Connect from the client and see that the server sidecar binds to 127.0.0.1 for the sidecar<->netcat socket.
  5. Start a netcat server in the server pod on port 9091 (undeclared).
  6. Connect from the client and see that the server sidecar binds to 127.0.0.6 for the sidecar<->netcat socket.
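
A minimal sketch of step 2, assuming the server pod is named server and carries a label the Service can select on (the names here are hypothetical):

$ kubectl expose pod server --name=server-svc --port=9090 --target-port=9090
# only port 9090 is declared; port 9091 is deliberately left out so that
# connections to it hit the inbound passthrough path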

Version (include the output of istioctl version --remote and kubectl version --short and helm version if you used Helm)

Istio:

client version: 1.5.7
cluster-local-gateway version:
ingressgateway version: 1.5.7
pilot version: 1.5.7
data plane version: 1.5.7 (16 proxies)

Kubernetes:

Client Version: v1.16.4
Server Version: v1.16.15

How was Istio installed?

profile.yaml

Environment where bug was observed (cloud vendor, OS, etc)

Minikube

Activity

lambdai (Contributor) commented on Dec 16, 2020

Thank you for the description!

Because of the traffic capture, the final TCP connection to the service application is established by the proxy (Envoy), so the peer IP seen by the service application can never be the client pod IP anyway.

Currently, the only way to obtain the original client pod IP is to bypass the capture.
I can add documentation somewhere in the wiki or FAQ.

self-assigned this on Dec 16, 2020

yanniszark (Author) commented on Dec 16, 2020

@lambdai thanks for your answer! But I don't think it explains the reason for choosing that address.

Because of the traffic capture, the final TCP connection to the service application is established by the proxy (Envoy), so the peer IP seen by the service application can never be the client pod IP anyway.

Agreed, because the TCP traffic is proxied, the server can't see the client's original address. But WHY choose 127.0.0.6? And why use 127.0.0.6 when proxying an undeclared port, but 127.0.0.1 when proxying a declared port?

To reproduce what I'm saying:

  1. Start a client and a server Pod, along with a k8s Service targeting the server port 8081.
  2. In the server Pod, start a netcat server on port 8081: nc -vlp 8081
  3. In the client Pod, connect to <server_ip>:8081.
  4. In the server Pod, print the connections: ss -tunap. You'll see that Envoy binds to 127.0.0.1 for the sidecar<->server connection (see the filtered ss sketch after this list).
  5. Now, in the server Pod, start a netcat server on port 8082 (undeclared): nc -vlp 8082
  6. In the client Pod, connect to <server_ip>:8082.
  7. In the server Pod, print the connections: ss -tunap. You'll see that Envoy binds to 127.0.0.6 for the sidecar<->server connection.
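
A hedged way to read steps 4 and 7, assuming netcat is the only listener on those ports: filter ss down to the netcat process and look at the Peer Address column, which is the address Envoy bound for the sidecar<->server leg.

$ ss -tnp | grep nc
# declared port 8081: peer address is 127.0.0.1:<ephemeral port>
# undeclared port 8082: peer address is 127.0.0.6:<ephemeral port>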

Why does this distinction exist? In the linked PR, @howardjohn claimed there was some complex logic at play, but it seems it was never documented. @howardjohn perhaps you recall what it was?

lambdai (Contributor) commented on Dec 16, 2020

Agreed, because the TCP traffic is proxied, the server can't see the client's original address. But WHY choose 127.0.0.6? And why use 127.0.0.6 when proxying an undeclared port, but 127.0.0.1 when proxying a declared port?

127.0.0.6 was chosen because it is a legitimate, unused local IPv4 address in the Linux IPv4 stack (the whole 127.0.0.0/8 block is local), and the .6 loosely echoes port 15006. I admit there might be alternatives.

$ ip route show table local |grep 127.0.0.0
local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1 
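
A quick way to confirm this, using the same traditional-netcat syntax as the commands above (the port number is arbitrary): any 127.0.0.0/8 address is usable without adding an interface or route.

$ nc -lp 9999 -s 127.0.0.6 &
$ nc -z 127.0.0.6 9999 && echo reachable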

why use 127.0.0.6 when proxying an undeclared port, but 127.0.0.1 when proxying a declared port?

Great experiment! The short answer is config simplicity.

There is some traffic that could lead to an infinite loop, and there is traffic that cannot.

The former is traffic that hits the passthrough filter chain, i.e., traffic to a pod port (addressed by the pod IP) for which Istio does not define a service. Binding the source to 127.0.0.6 marks that traffic as "inbound traffic", so it will never hit the outbound iptables rules.

The latter traffic, which hits a k8s service target port, has the destination address 127.0.0.1 and won't cause an infinite loop. Binding 127.0.0.6 brings no benefit there, so 127.0.0.1 is automatically chosen by the Linux kernel. Theoretically, we could bind 127.0.0.6 there as well, at the cost of another syscall and some additional Envoy config.
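
For reference, this exclusion shows up as an iptables rule of roughly the following shape in the pod's nat table (chain name as created by istio-iptables; the exact rules may differ between versions):

$ iptables -t nat -S ISTIO_OUTPUT | grep 127.0.0.6
-A ISTIO_OUTPUT -o lo -s 127.0.0.6/32 -j RETURN
# connections sourced from 127.0.0.6 return early and are never redirected
# back into the outbound capture, which is what breaks the potential loop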

yanniszark (Author) commented on Dec 18, 2020

@lambdai thanks, I think I almost got it, but there is an important piece of the puzzle missing: the address that Envoy proxies the TCP connection to.

  • For declared ports, Envoy proxies the connection to 127.0.0.1. Envoy (server-sidecar) binds to 127.0.0.1 and the server binds to 127.0.0.1 as well.
  • For undeclared ports, Envoy proxies the connection to the Pod IP! Envoy (server-sidecar) binds to 127.0.0.6 and the server binds to <pod_ip>.

I understand that the 2nd case would be indistinguishable from other cases (e.g., a process in the pod trying to talk to a server in the same pod through a Service ClusterIP), so Envoy has to bind to a different source address in order to recognize and exclude this traffic in iptables.

So why this distinction in the destination IP? Why proxy to the <pod_ip> and not to localhost for undeclared ports? Here is my guess:

  • By proxying undeclared ports to the <pod_ip>, Istio avoids exposing undeclared ports (servers) that only listen on localhost, so it preserves the secure localhost network.

But then my question is: why does Istio require services to bind to localhost or 0.0.0.0 for proxying them (https://istio.io/latest/docs/ops/deployment/requirements/#application-bind-address)? Couldn't Istio proxy traffic for a declared port whose server binds to <pod_ip> by using the 127.0.0.6 trick and treating all cases the same?
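
A small sketch of the guess above, using the same netcat commands as earlier (the ports are arbitrary): if the guess is right, a listener bound only to localhost on an undeclared port stays unreachable from outside the pod, because Envoy dials the pod IP for passthrough traffic.

# in the server pod: localhost-only listener on an undeclared port
$ nc -vlp 9091 -s 127.0.0.1
# from the client pod: this should fail, since the passthrough connection
# targets <server_pod_ip>:9091 and nothing is listening on the pod IP
$ nc -vz <server_pod_ip> 9091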

yanniszark (Author) commented on Mar 25, 2021

Ping @lambdai. I know it's been some time since this discussion and a lot of things are out of my mental cache (and yours as well probably). I'd love to be able to revisit this sometime soon.

lambdai (Contributor) commented on Mar 25, 2021

But then my question is: why does Istio require services to bind to localhost or 0.0.0.0 for proxying them (https://istio.io/latest/docs/ops/deployment/requirements/#application-bind-address)? Couldn't Istio proxy traffic for a declared port whose server binds to <pod_ip> by using the 127.0.0.6 trick and treating all cases the same?

I think that is an old security practice: Istio uses only 127.0.0.1 as the endpoint, so if your service listens on 127.0.0.1 you are protected by the Istio sidecar proxy. Listening on 0.0.0.0 is fine when you want both access through the sidecar proxy and access that bypasses it.
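
To make the two options concrete (same netcat-style commands as above; declared port 8081 assumed):

# reachable only through the sidecar, which forwards to 127.0.0.1:8081
$ nc -vlp 8081 -s 127.0.0.1
# binds to 0.0.0.0, so reachable both through the sidecar and when the sidecar is bypassed
$ nc -vlp 8081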

This design addresses Istio sidecar proxy access to the pod IP:
https://docs.google.com/document/d/1j-5_XpeMTnT9mV_8dbSOeU7rfH-5YNtN_JJFZ2mmQ_w/edit#heading=h.xw1gqgyqs5b


8 remaining items


Metadata

Labels: area/networking, kind/docs, lifecycle/automatically-closed, lifecycle/stale


Participants: @lambdai, @howardjohn, @yanniszark, @istio-policy-bot
