Initial HBONE for sidecars and ingress (for master) #41391

Merged: 6 commits, Oct 19, 2022

Conversation

@howardjohn (Member) commented Oct 12, 2022

This PR introduces initial support for the HBONE transport for sidecars and gateways.

To enable this (it will get easier in the future): --set values.pilot.env.PILOT_ENABLE_INBOUND_PASSTHROUGH=false --set values.pilot.env.PILOT_ENABLE_HBONE=true --set meshConfig.proxyConfig.proxyMetadata.ISTIO_META_ENABLE_HBONE=true (the values may need to be passed in YAML form, or true/false may get parsed as booleans instead of strings).
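
For reference, a minimal sketch of those same settings in IstioOperator YAML form, which sidesteps the bool-vs-string issue by quoting the values (the flag names come from this PR; the overlay shape is the standard IstioOperator structure):

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  values:
    pilot:
      env:
        PILOT_ENABLE_INBOUND_PASSTHROUGH: "false"   # quoted so it stays a string
        PILOT_ENABLE_HBONE: "true"
  meshConfig:
    proxyConfig:
      proxyMetadata:
        ISTIO_META_ENABLE_HBONE: "true"             # opts the proxy in via injection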

Design doc: https://docs.google.com/document/d/1Ofqtxqzk-c_wn0EgAXjaJXDHB9KhDuLe-W3YGG67Y8g/edit

The design is described below:

A new label, networking.istio.io/tunnel: http, will be added to all pods during injection. Its purpose is to declare "this pod supports tunnelling". We always inject it, and it reliably signals tunnel support because only injector versions that support tunnelling add this logic. This mirrors the tlsMode label used for auto-mTLS. The tlsMode label cannot be reused directly, since we want legacy pods to keep using the old mTLS style (see below).
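
For illustration, the labels an injected pod ends up with (values as described in this PR):

metadata:
  labels:
    security.istio.io/tlsMode: istio        # existing auto-mTLS signal
    networking.istio.io/tunnel: http        # new: "this pod supports tunnelling"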

A new port, 15008, is added; it has no iptables redirection and an associated listener that terminates HBONE (over mTLS).


PeerAuthentication semantics are essentially unchanged; they do not impact HBONE:

  • DISABLE: port 15006 only accepts plaintext; port 15008 unchanged
  • PERMISSIVE: port 15006 accepts either; port 15008 unchanged
  • STRICT: port 15006 only accepts mTLS; port 15008 unchanged

There is currently no planned way to say "only accept HBONE traffic"; HBONE and mTLS are considered equivalent from a security perspective.
Note: in a distant future there would be no standard mTLS at all; STRICT would entirely close port 15006, PERMISSIVE would open it, and DISABLE would mean nothing.
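
To make these semantics concrete, a standard PeerAuthentication (real API; the namespace is hypothetical): STRICT only constrains the regular inbound path, while the HBONE port keeps terminating mTLS-wrapped CONNECT either way.

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: foo          # hypothetical namespace
spec:
  mtls:
    mode: STRICT          # 15006 only accepts mTLS; 15008 (HBONE) is unchanged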

Auto mTLS (no DR configured):

  • If there is a tunnel label, we will use a tunnel
  • Else if there is a tlsMode label, we will use mTLS
  • Else, we will use plaintext

Legacy clients will behave the same, except they will never use the tunnel (since they do not understand it).


DestinationRule.tls semantics:

  • DISABLE: plaintext (over tunnel, if tunnel supported)
  • MUTUAL, SIMPLE: TLS (over tunnel, if tunnel supported)
  • ISTIO_MUTUAL: Tunnel or mTLS (tunnel preferred if supported)
    Essentially, we have decoupled the transport (tunnel) from the application TLS. However, in the case of ISTIO_MUTUAL (which would not exist had we designed the API with the tunnel in mind), we don't want to do both, so we pick one.
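
For example, a sketch of the ISTIO_MUTUAL case (the rule name and host are hypothetical; the API is a standard DestinationRule). Against a tunnel-capable endpoint the proxy uses the tunnel; otherwise it falls back to classic mTLS, but never stacks both:

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews-istio-mutual                    # hypothetical name
spec:
  host: reviews.default.svc.cluster.local       # hypothetical host
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL    # tunnel preferred when supported, else classic mTLS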

Transport will be determined using transport_socket_match. If the upstream supports the tunnel, we add tunnel metadata that directs traffic to our "initiate tunnel" internal listener, and we match on this metadata in the transport_socket_match. We keep our tlsMode matcher and plaintext fallback (where appropriate; see the config above), but with tunnel > tls > plaintext priority.
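
A rough sketch of the resulting cluster config follows. Envoy evaluates transport_socket_matches in order against endpoint metadata, so the ordering encodes the tunnel > tls > plaintext priority; the metadata keys and the hand-off to the internal listener are illustrative assumptions, not the exact generated config:

transport_socket_matches:
- name: tunnel                                  # highest priority
  match:
    tunnel: {}              # assumed metadata key on tunnel-capable endpoints
  transport_socket:
    name: envoy.transport_sockets.raw_buffer    # mTLS/HBONE happens on the
                                                # "initiate tunnel" internal-listener
                                                # path (details elided)
- name: tlsMode-istio
  match:
    tlsMode: istio          # existing auto-mTLS endpoint metadata
  transport_socket:
    name: envoy.transport_sockets.tls           # Istio cert config elided
- name: tlsMode-disabled                        # fallback; empty match matches all
  match: {}
  transport_socket:
    name: envoy.transport_sockets.raw_buffer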

On the inbound side, we introduce a new listener on port 15008, which terminates TLS and HBONE. The route matcher has a variety of matches, using a suffix match for :<port> on :authority, and routes to an internal listener inbound|port (see the sketch after this list). Each inbound|port consists of the same filter chains as the 15006 listener, but:

  • There is only one filter chain (two if sniffing)
  • No TLS or transport_socket match
  • No destination port match (already known by the listener)
    These forward to the same inbound cluster. Note: the inbound cluster today is broken, since it is ORIGINAL_DST. As a workaround we can set PILOT_ENABLE_INBOUND_PASSTHROUGH=false, but long term we need to be able to pass the original destination through the internal listener.
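
A hedged sketch of the 15008 route table described above (the port, names, and overall shape are illustrative; connect_matcher and the :authority suffix HeaderMatcher are standard Envoy route APIs):

route_config:
  virtual_hosts:
  - name: inbound-hbone
    domains: ["*"]
    routes:
    - match:
        connect_matcher: {}             # match CONNECT requests
        headers:
        - name: ":authority"
          string_match:
            suffix: ":8080"             # hypothetical inbound port
      route:
        cluster: "inbound|8080"         # handed to the per-port internal listener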

TODO in followups:

  • Add proper testing in CI, and fix issues. The general cases work today, but a few edge cases need work
  • Ensure Telemetry works properly (and test it, per ^)
  • Longer term - optimize HBONE xDS and runtime to avoid internal listener overhead

HBONE is not yet ready for production. It is opt-in behind a feature flag for now.

@howardjohn added the release-notes-none label (Indicates a PR that does not require release notes.) Oct 12, 2022
@howardjohn requested review from a team as code owners October 12, 2022 18:34
@istio-testing added the size/XL label (Denotes a PR that changes 500-999 lines, ignoring generated files.) Oct 12, 2022
@stevenctl (Contributor) left a comment:

didn't get a chance to look all the way through, will look more thoroughly tomorrow morning

ConnectTimeout: durationpb.New(2 * time.Second),
CleanupInterval: durationpb.New(60 * time.Second),
LbConfig: &cluster.Cluster_OriginalDstLbConfig_{
OriginalDstLbConfig: &cluster.Cluster_OriginalDstLbConfig{UseHttpHeader: true},
Contributor:

When will we move off of x-envoy-original-dst-host to some standard HBONE header name?

Member Author:

The header doesn't actually need to be on the wire; this is a hack to pass the original dst through the internal listener. I feel like we already solve that with a custom filter... did we?

Contributor:

Yeah the filter is set_dst_address or similar. It looks at the tunnel metadata

@stevenctl (Contributor) Oct 14, 2022:

func InternalListenerSetAddressFilter() *listener.ListenerFilter {

If the internal listener has this, you can get rid of the x-envoy header in the listener and the cluster's OriginalDstLbConfig can be default.

The filters are all in master for istio/proxy.

https://github.com/istio/proxy/tree/master/src/envoy/set_internal_dst_address

@hzxuzhonghu (Member):

is this for ambient mesh?

@stevenctl (Contributor):

is this for ambient mesh?

Not entirely. I think there are some docs written a while ago that explain the more general benefits HBONE provides, and John can probably elaborate. It is required for sidecar/ambient interop, though.

One example use case is multi-network: instead of using SNI and getting less predictable load balancing, we can use HBONE, and the client can pick an IP that's not reachable in its network and start an HBONE tunnel to the gateway to ask to be forwarded to a specific pod.

@hzxuzhonghu (Member):

Great use case; really excited that it can solve the multi-network SNI drawbacks.

@kyessenov (Contributor) commented Oct 14, 2022:

H2 CONNECT is perfectly ready for production use. Can we separate support to be more specific? There's little value in doing HTTP-in-HTTP, for example, but good reason to replace the ALPN TLS hackery with HTTP/1.1 CONNECT (not h2 CONNECT).

@@ -235,6 +235,9 @@ data:
metadata:
labels:
security.istio.io/tlsMode: {{ index .ObjectMeta.Labels `security.istio.io/tlsMode` | default "istio" | quote }}
{{- if eq (index .ProxyConfig.ProxyMetadata "ISTIO_META_ENABLE_HBONE") "true" }}
Contributor:

I really dislike the term HBONE. What this really means is accepting mTLS HTTP CONNECT (only)? Why restrict to h2? h2 is objectively worse than HTTP/1.1 for some cases.

Member Author:

If you'd like to change the protocol we use, I suggest proposing that in another design doc. I don't necessarily disagree with you, but this PR implements a design that has been established for ~3 years and has gone through plenty of discussion and naming bikeshedding; this PR isn't a good place to try to change that.

@kyessenov (Contributor) Oct 14, 2022:

Yes, I'm not calling for changing the PR or the design docs. This is a purely internal annotation. HBONE doesn't tell you anything about HTTP/TCP/UDP, h2 vs HTTP/1.1, etc.

Contributor:

And it shouldn't; there are well-defined ways to identify h2 vs h1 (ALPN) and to include metadata (headers, labels).

What I would like is an option (env variable?) to skip the auto-hbone and just do exclusive hbone and nothing else.

@kyessenov (Contributor):

I think we need the following granularity of support:

  • enable acceptance of HTTP mTLS, both HTTP/1.1 and h2 (for both TCP and HTTP, TCP should not use an internal listener for performance reasons); this must carry a warning that it breaks NetworkPolicy in k8s.
  • enable routing for TCP via http/1.1 instead of direct port; this solves ALPN mess and should have zero performance impact;
  • enable routing for all traffic.

@howardjohn (Member Author):

I think we need the following granularity of support:

In general it doesn't hurt to have more granular flags, so I am not against it. But do you have more context on where you think this would be useful? I was trying to think where someone may want to configure to accept HBONE but not send it and didn't come up with any

@kyessenov (Contributor):

configure to accept HBONE but not send it and didn't come up with any

Well, the main reason for using standard protocols (and not something custom, as HBONE makes it sound) is interop. gRPC or any mesh-less endpoints should be able to communicate with an Envoy sidecar. I don't understand the bi-directional question; usually you respond on the same channel, so it's always the same protocol.

@howardjohn (Member Author):

I am not sure I follow. I thought your comment was saying "make 2 options, one to send HBONE and one to accept HBONE". My question is when would someone set one without the other?

Note that we already have autodetection of support, so "send HBONE" really means "send HBONE if it's supported by the destination".

@kyessenov (Contributor):

I don't believe sending HTTP in HTTP CONNECT has zero performance implications, and I'd like to see a performance evaluation to be convinced otherwise. So yes, of course, you'd want to not send it, yet accept it for interop.

@howardjohn (Member Author):

I was thinking that if you don't think HBONE is performant/ready, you wouldn't want to accept it either, since it will still impact your performance, just for inbound calls rather than outbound calls. But maybe some users only care about outbound performance.

@kyessenov (Contributor):

@howardjohn The problem is that if you want to interoperate with other meshes, they will never use the made-up ALPN/MX, etc. protocols. So it's a necessity to accept it from a separate ambient or gRPC mesh, for example. It doesn't mean you have to use it when you know the endpoints have Istio. The reason for granularity is that we can predict that TCP over HTTP/1.1 CONNECT will not degrade performance for Istio, so that can be enabled by default. The same cannot be said for HTTP traffic.

@hzxuzhonghu (Member):

Can any of you link the design docs? I think folks not from Google don't know much about this.

@costinm (Contributor) commented Oct 15, 2022:

What we call HBONE is really the protocol that browsers and apps are using, and I don't think there is any reason to not support HTTP/1.1 CONNECT as well, but at a later point. I don't think it's an exclusive choice - it is great for a node to support a set of standard protocols, and interoperate with as many other standards as we can.

But we do need one 'baseline' protocol that all nodes should support, and I think HTTP/2- and HTTP/3-based CONNECT remains the right choice for standardization and interop. Just like browsers and servers support multiple protocols, so can we; but the first step is to support one protocol. Adding HTTP/1 CONNECT using standard ALPN should probably be the next step.

The main reason for choosing a standard protocol is interop with other infrastructure and, hopefully, other mesh implementations.

My concern is a bit different: assuming we have a well-tuned and minimal ztunnel implementation, wouldn't it be simpler to run it as an extra process or container alongside the standard Envoy sidecar, and let it take care of all HBONE and future ambient protocols? The ztunnel process would handle whatever is needed for security and hand the traffic to Envoy as TCP.

Of course, the main problem is how we pass the cert info and metadata to Envoy. I don't think the current PROXY support in Envoy is sufficient, but it could be extended. I wrote a proposal some time back about 'trust delegation', supporting a model where mTLS is handled by an ingress or other middle box; this would require some changes in Envoy and our auth.

Note that I'm not against merging this PR. It is important to make progress, and it provides interop, but long term I would replace it with a simpler model. In terms of perf, my tests don't show a big difference between forwarding (in particular over UDS) and current Envoy internal listeners.

@costinm (Contributor) commented Oct 15, 2022:

I think we need the following granularity of support:

  • enable acceptance of HTTP mTLS, both HTTP/1.1 and h2 (for both TCP and HTTP, TCP should not use an internal listener for performance reasons); this must carry a warning that it breaks NetworkPolicy in k8s.

Eventually we may accept H1 and even others, but I don't think this is a requirement for now and certainly not for this PR, and we should have one baseline we test with and we can standardize on.

NetworkPolicy is a different story - agree, we need a warning until we support it, and I think we must support it (i.e. ztunnel authz should process it).

  • enable routing for TCP via http/1.1 instead of direct port; this solves ALPN mess and should have zero performance impact;

What ALPN mess? Sorry, I missed that - I thought the HBONE port uses the standard h2 ALPN.

When we add support for HTTP/1.1 CONNECT - and I hope we do it, but not before we have a solid release - it should also work just like h/1 and h2 are expected to negotiate.

  • enable routing for all traffic.

Not sure what that means, but the current 'auto' model seems to be good enough and easy for the first iteration. We may add other options in the future.

@costinm (Contributor) commented Oct 15, 2022:

H2 CONNECT is perfectly ready for production use. Can we separate support to be more specific? There's little value in doing HTTP-in-HTTP, for example, but good reason to replace the ALPN TLS hackery with HTTP/1.1 CONNECT (not h2 CONNECT).

I agree on the HTTP-over-HTTP double framing, and I think at some point in the future we will need a way to avoid it (in particular for proxyless gRPC), but I also think we can start with treating everything as TCP and extend only after things are stable. Same about H1 CONNECT - I really want that too, but as a future option.

If you could explain more about the ALPN 'hackery' - I didn't see the code doing that.

For 'production use' - I think there are whitepapers and docs about the extensive use of this kind of protocol in Google prod, for example BeyondCorp.

// Generally, this could be a per-proxy setting (and is, via ENABLE_HBONE node metadata).
// However, there are some code paths that impact all clients, hence the global flag.
// Warning: do not enable by default until endpoint_builder.go caching is fixed (and possibly other locations).
EnableHBONE = env.Register(
Contributor:

Not sure I follow: why do we need two settings to enable this? One injection setting (using META_..) or a label should be enough.

I think it would be ideal to enable this from the workload side, not centrally.

Member Author:

That was my original plan but I added this because the endpoint_builder part requires disabling a critical cache in order to support HBONE; this cache is not workload-bound.

I would hope we find a way around that issue and remove this setting shortly after

@costinm (Contributor) left a comment:

Looks good; a few cosmetic comments that can be fixed in separate PRs.

Most of the testdata seems to be boilerplate. Is there any golden data file with hbone enabled? That would make it easier to review the output.

@@ -159,6 +159,12 @@ const (
k8sSeparator = "."
)

const (
TunnelLabel = "networking.istio.io/tunnel"
Contributor:

I know this will not be popular, but how about 'hbone' as the label? It is not really an Istio protocol and should not be perceived as one.

Using 'tunnel' or 'h2' is way too broad.

Member Author:

I think we need a domain, per k8s best practices. Maybe we should buy hbone.io...

Also, somewhat related: instead of tunnel=h2..., I was thinking we should make it future-proof.

In the future we may want HTTP/1.1 support; we probably don't need to say tunnel=h2,h1, since we can negotiate over ALPN, so there is no need to change the label there. But h2 is also probably a bad name.

But if we add HTTP/3, we cannot use ALPN negotiation. We could use alt-svc, but that seems wasteful when we could know ahead of time (and it is also ~broken to ever use http/2 for tunneling UDP), so we likely need a way to say tunnel=h2,h3.

So basically:

  • The label should support multiple CSV values (somewhat like ALPN)
  • h2 is probably a bad name, but I am not sure of a better one.
  • We can switch the base of the label, but it ought to be a domain we own.

WDYT?

Contributor:

Agree with @costinm. This needs to be a declaration that a TCP port is open for mTLS, and we should let ALPN decide the codec. We'll need a separate declaration for a QUIC port. And another declaration that regular ports are closed.

Contributor:

Not sure about the last one - it's not our concern if NetworkPolicy or some other setting is closing regular ports.

What we want to know is:

  • if a pod is using ztunnel
  • if a pod is using sidecar with ztunnel support

Forget h2; once the TLS handshake happens, we can negotiate what protocol we use.

Since ztunnel should not implement the old protocol, maybe it doesn't even matter to indicate a sidecar with ztunnel support. By the time we roll out ztunnel, we can assume 1-2 versions of Istio are done and all sidecars accept ztunnel.

@@ -1633,3 +1639,49 @@ func removeListenerFilterTimeout(listeners []*listener.Listener) {
func listenerKey(bind string, port int) string {
return bind + ":" + strconv.Itoa(port)
}

const baggageFormat = "k8s.cluster.name=%s,k8s.namespace.name=%s,k8s.%s.name=%s,service.name=%s,service.version=%s"
Contributor:

Not sure about that... maybe make it a flag? We need to discuss.

I would not make it k8s-specific, and I would avoid redundancy - the namespace is part of the cert. Is service.name the 'canonical service' of the caller? Because the caller already knows the service.

I would expect we'll need some config similar to access log format, but not in this PR.

Contributor:

It's not k8s-specific; labels are not canonical. Agree about further compressing, and this is also far from complete: there is no mesh ID, and no pod-specific metadata.

Contributor:

The 'k8s' name in the label is pretty k8s-specific. I don't think we should send redundant data - we don't support multiple meshes, so the mesh ID should be the same in all workloads. If we do add support, we can add an option.

We should send the minimum amount of data required by the user - there are costs not only in sending and parsing, but also in constructing too many timeseries.

Certainly not a constant...

Contributor:

Agree. @costinm can you file an issue on what can be removed? The list is here https://github.com/istio/proxy/blob/master/extensions/common/metadata_object.h#L101. Apart from namespace, there isn't much to improve. App name/version is not the same as canonical name/version. Cluster name, mesh ID, and trust domain in SA are all distinct.

@kyessenov (Contributor) commented Oct 15, 2022:

Eventually we may accept H1 and even others, but I don't think this is a requirement for now and certainly not for this PR, and we should have one baseline we test with and we can standardize on.

Envoy will auto detect codec so it's weird to have h2 as an upgrade token, when http/1.1 is accepted. HTTP/1.1 is preferred in a sidecar model when connections are already pooled. We can also avoid using internal listeners and all framing costs of h2 altogether for HTTP/1.1 probably, since it's just like the PROXY protocol.

  • enable routing for TCP via http/1.1 instead of direct port; this solves ALPN mess and should have zero performance impact;

By ALPN mess I'm referring to the istio-peer-exchange and istio-http1.1 tokens that Istio uses, as well as the matrix of their combinations. We need to stop that sooner, and the easiest way is to send HTTP/1.1 CONNECT for all TCP by default.

When we add support for HTTP/1.1 CONNECT - and I hope we do it, but not before we have a solid release - it should also work just like h/1 and h2 are expected to negotiate.

  • enable routing for all traffic.

Not sure what that means, but the current 'auto' model seems to be good enough and easy for the first iteration. We may add other options in the future.

Auto means it's always preferred. I don't think that's what users want for all traffic until we can prove otherwise with performance data. Internal listeners add 10% CPU overhead plus 1MB buffer cache per connection. So sending to the CONNECT port should be opt-in. But it's completely safe to just accept CONNECT by default (modulo network policy). It's not safe to upgrade users to this model.

@costinm (Contributor) commented Oct 16, 2022:

Eventually we may accept H1 and even others, but I don't think this is a requirement for now and certainly not for this PR, and we should have one baseline we test with and we can standardize on.

Envoy will auto detect codec so it's weird to have h2 as an upgrade token, when http/1.1 is accepted. HTTP/1.1 is preferred in a sidecar model when connections are already pooled. We can also avoid using internal listeners and all framing costs of h2 altogether for HTTP/1.1 probably, since it's just like the PROXY protocol.

That's all internal envoy logic - I think the primary concern here is to have a future proof, interoperable protocol, that everyone can support. Very few load balancers accept http/1.1 CONNECT, and I think http/2 and 3 are more future proof anyways.

My point is that I don't disagree with eventually supporting H1 or other protocols - but we need a baseline.

  • enable routing for TCP via http/1.1 instead of direct port; this solves ALPN mess and should have zero performance impact;

By ALPN mess I'm referring to the istio-peer-exchange and istio-http1.1 tokens that Istio uses, as well as the matrix of their combinations. We need to stop that sooner, and the easiest way is to send HTTP/1.1 CONNECT for all TCP by default.

When we add support for HTTP/1.1 CONNECT - and I hope we do it, but not before we have a solid release - it should also work just like h/1 and h2 are expected to negotiate.

  • enable routing for all traffic.

Not sure what that means, but the current 'auto' model seems to be good enough and easy for the first iteration. We may add other options in the future.

Auto means it's always preferred. I don't think that's what users want for all traffic until we can prove otherwise with performance data. Internal listeners add 10% CPU overhead plus 1MB buffer cache per connection. So sending to the CONNECT port should be opt-in. But it's completely safe to just accept CONNECT by default (modulo network policy). It's not safe to upgrade users to this model.

I see what you mean. Yes, we may want a slower opt-in.

John, maybe we could use CONNECT when the destination is identified as a ztunnel, and otherwise keep using the old Istio protocol? And accept it in all cases.

@kyessenov (Contributor):

That's all internal envoy logic - I think the primary concern here is to have a future proof, interoperable protocol, that everyone can support. Very few load balancers accept http/1.1 CONNECT, and I think http/2 and 3 are more future proof anyways.

Are you saying h2 CONNECT is more widely deployed than HTTP/1.1 CONNECT? That can't be true. Moreover, when connections are already pooled (regardless of envoy or not, it just needs HTTP awareness) and there is no sharing of tunnels (most sidecars), HTTP/1.1 may be the recommended version anyways (until quic is ready).

@kyessenov (Contributor) commented Oct 17, 2022:

That's all internal envoy logic - I think the primary concern here is to have a future proof, interoperable protocol, that everyone can support. Very few load balancers accept http/1.1 CONNECT, and I think http/2 and 3 are more future proof anyways.

Are you saying h2 CONNECT is more widely deployed than HTTP/1.1 CONNECT? That can't be true. Moreover, when connections are already pooled (regardless of envoy or not, it just needs HTTP awareness) and there is no sharing of tunnels (most sidecars), HTTP/1.1 may be the recommended version anyways (until quic is ready).

No, I am saying that for tunneling raw data, H2 is more likely to be supported. Maybe not CONNECT, but POST in some cases (envoy supports this and we'll likely need it as well).

Also:

  • H2/H3 are the future, I wouldn't use H1 as a base/standard protocol
  • I don't think this PR is the proper place to revisit the core protocol.

I don't think anyone objects to adding additional protocol support - just on timing.

That's exactly my point, don't hard-code h2. Allow negotiation, and permit the use of HTTP/1.1 CONNECT for raw TCP when a server wants that, and permit the use of POST method (which seems out of standard of CONNECT-* actually). Why impose all the framing costs and dangers of TCP-on-TCP retry loops and flow control management on everyone?

QUIC is separate already since it's on UDP and there's no http4 yet to negotiate. That's truly the future, but there are few proxies that support all of QUIC.

@kyessenov (Contributor):

Forgot to mention: HTTP/1.1 CONNECT version can be implemented in the kernel (with eBPF). h2 spec is too complicated to properly implement as an extension. So by forcing h2 on the server side we're eliminating potential eBPF clients.

@costinm (Contributor) commented Oct 17, 2022:

That's all internal envoy logic - I think the primary concern here is to have a future proof, interoperable protocol, that everyone can support. Very few load balancers accept http/1.1 CONNECT, and I think http/2 and 3 are more future proof anyways.

Are you saying h2 CONNECT is more widely deployed than HTTP/1.1 CONNECT? That can't be true. Moreover, when connections are already pooled (regardless of envoy or not, it just needs HTTP awareness) and there is no sharing of tunnels (most sidecars), HTTP/1.1 may be the recommended version anyways (until quic is ready).

No, I am saying that for tunneling raw data, H2 is more likely to be supported. Maybe not CONNECT, but POST in some cases (envoy supports this and we'll likely need it as well).
Also:

  • H2/H3 are the future, I wouldn't use H1 as a base/standard protocol
  • I don't think this PR is the proper place to revisit the core protocol.

I don't think anyone objects to adding additional protocol support - just on timing.

That's exactly my point, don't hard-code h2. Allow negotiation, and permit the use of HTTP/1.1 CONNECT for raw TCP when a server wants that, and permit the use of POST method (which seems out of standard of CONNECT-* actually). Why impose all the framing costs and dangers of TCP-on-TCP retry loops and flow control management on everyone?

I agree the label should not use 'h2' - I think John was also looking for alternative names, and ALPN can be used to negotiate an alternative protocol - and as I mentioned, I also agree we should support H1/CONNECT, H2/POST, and other alternatives in the future.

Flow control per stream in H2/H3 is pretty good - and again, the main point is that H2/H3 are better protocols with a lot of infrastructure, so a better long-term bet than H1.

We also know this works - Google is using this extensively.

Note that CONNECT is not as simple as it sounds - it is really CONNECT over TLS; we wouldn't send the headers in plain text. That precludes 'splice' and likely a lot of other optimizations, and it is not really what standard HTTP/1.1 CONNECT would do - that's a major flaw in the protocol. Firefox/Chrome are now also supporting H2/CONNECT with the outer connection encrypted. I am looking at using CONNECT in secure networks (where the low-level network provides security) - but that is a narrow case.

Future design docs and optimizations will certainly happen - but I think we need to focus on H2.

QUIC is separate already since it's on UDP and there's no http4 yet to negotiate. That's truly the future, but there are few proxies that support all of QUIC.

Maybe. UDP packets tend to be dropped more by some routers, and NAT and other things are a bit different. We'll certainly use it at some point.

@costinm (Contributor) commented Oct 17, 2022:

Forgot to mention: HTTP/1.1 CONNECT version can be implemented in the kernel (with eBPF). h2 spec is too complicated to properly implement as an extension. So by forcing h2 on the server side we're eliminating potential eBPF clients.

Kernel and eBPF are not magic - they have their proper uses, and I'm not sure doing mTLS in eBPF is what we should be doing. CONNECT is the easy part - crypto, cert verification, etc. are far more complicated. And we are not 'forcing' or eliminating anything - having H2 as the core protocol doesn't mean it's the only protocol we'll ever support.

@kyessenov (Contributor) commented Oct 17, 2022:

OK, I'm purely arguing for standards conformance. HBONE is not h2; it should allow any HTTP version. It's fine to pick one as a recommendation (based on experimental data), but since the specification doesn't mandate anything, we should accept any HTTP version on the server side, and initiate any version if we want to interop with older infrastructure. Envoy has a notion of "codec invariance", meaning any HTTP feature is uniformly supported across all HTTP versions, so there's no technical reason why we would not allow all.

QUIC will be required for UDP. Networks that have UDP traffic can properly route QUIC already.

@costinm (Contributor) commented Oct 17, 2022 via email

@howardjohn force-pushed the hbone/sidecars-master branch from 72b6af4 to 3e2bb44 on October 17, 2022 19:35
@howardjohn (Member Author):

In the most recent commit I removed the h2 label and made it future-proof and documented. I don't want to implement HTTP/1.1 or other changes as part of the initial implementation; we can do them as followups, though.

@@ -235,6 +235,9 @@ data:
metadata:
labels:
security.istio.io/tlsMode: {{ index .ObjectMeta.Labels `security.istio.io/tlsMode` | default "istio" | quote }}
{{- if eq (index .ProxyConfig.ProxyMetadata "ISTIO_META_ENABLE_HBONE") "true" }}
networking.istio.io/tunnel: {{ index .ObjectMeta.Labels `networking.istio.io/tunnel` | default "h2" | quote }}
Contributor:

I think we mentioned this earlier, but this should not have "istio" in the name. The protocol is not istio-specific, and we should try to communicate that.

Member Author:

I agree, but I don't have any better ideas. It needs to be a domain we own, per k8s guidelines, I think.

Contributor:

This is going to be very hard to change, so we need to publish the right one the first time.

@costinm (Contributor) commented Oct 18, 2022 via email

@costinm (Contributor) commented Oct 18, 2022 via email

@howardjohn (Member Author):

I talked with Louis and Justin about this; there was some consensus to stick with istio.io for now and change it if needed in the future.

@kyessenov (Contributor):

From the description: networking.istio.io/tunnel: h2.
Why are we still using h2 as the value? I thought we agreed to delegate codec selection to ALPN. Wouldn't networking.istio.io/tunnel: hbone be more appropriate? Then we can add other values: qbone, proxy (unlikely).

@howardjohn (Member Author):

The description is outdated; I'll update it.

@howardjohn (Member Author):

Also, as discussed with @costinm offline and as suggested in this PR, I will do a followup to split "accept hbone" and "send hbone" enablement.

@costinm (Contributor) left a comment:

Approving the PR - we can address improvements and changes in follow-ups.

@howardjohn (Member Author):

/retest

@istio-testing merged commit 9f6de15 into istio:master Oct 19, 2022
@howardjohn (Member Author):

/cherrypick release-1.16

@istio-testing (Collaborator):

@howardjohn: new pull request created: #41519

In response to this:

/cherrypick release-1.16

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@howardjohn mentioned this pull request Oct 21, 2022