Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add proxy for container streaming in kubelet for streaming auth. #64006

Merged
merged 4 commits into from Jun 1, 2018

Conversation

Random-Liu
Copy link
Member

@Random-Liu Random-Liu commented May 18, 2018

For #36666, option 2 of #36666 (comment).

This PR:

  1. Removed the DirectStreamingRuntime, and changed IndirectStreamingRuntime to StreamingRuntime. All DirectStreamingRuntimes, dockertools and rkt, were removed.
  2. Proxy container streaming in kubelet instead of returning redirect to apiserver. This solves the container runtime authentication issue, which is what we agreed on in [CRI] Add authentication to streaming server #36666.

Please note that, this PR replaced the redirect with proxy directly instead of adding a knob to switch between the 2 behaviors. For existing CRI runtimes like containerd and cri-o, they should change to serve container streaming on localhost, so as to make the whole container streaming connection secure.

If a general authentication mechanism proposed in #62747 is ready, we can switch back to redirect, and all code can be found in github history.

Please also note that this added some overhead in kubelet when there are container streaming connections. However, the actual bottleneck is in the apiserver anyway, because it does proxy for all container streaming happens in the cluster. So it seems fine to get security and simplicity with this overhead. @derekwaynecarr @mrunalp Are you ok with this? Or do you prefer a knob?

@yujuhong @timstclair @dchen1107 @mikebrow @feiskyer
/cc @kubernetes/sig-node-pr-reviews
Release note:

Kubelet now proxies container streaming between apiserver and container runtime. The connection between kubelet and apiserver is authenticated. Container runtime should change streaming server to serve on localhost, to make the connection between kubelet and container runtime local.

In this way, the whole container streaming connection is secure. To switch back to the old behavior, set `--redirect-container-streaming=true` flag.

@Random-Liu Random-Liu added this to the v1.11 milestone May 18, 2018
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels May 18, 2018
@Random-Liu Random-Liu changed the title Streaming auth Add proxy for container streaming in kubelet for streaming auth. May 18, 2018
@Random-Liu Random-Liu force-pushed the streaming-auth branch 2 times, most recently from f1d3c57 to c59bff0 Compare May 18, 2018 01:54
@Random-Liu Random-Liu added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. kind/feature Categorizes issue or PR as related to a new feature. labels May 18, 2018
@Random-Liu
Copy link
Member Author

With this PR, crictl also works with dockershim now:

$ sudo /usr/local/bin/crictl --runtime-endpoint=unix:///var/run/dockershim.sock exec -i -t 1fe29f12bc656 /bin/sh
# ls
ls
bin  build  cgi-bin  conf  error  htdocs  icons  include  logs	modules
#

Copy link
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments / questions.

s.host.StreamingConnectionIdleTimeout(),
remotecommandconsts.DefaultStreamCreationTimeout,
remotecommandconsts.SupportedStreamingProtocols)
handler := proxy.NewUpgradeAwareHandler(url, nil, false, false, &responder{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// retrive handler for url location using default proxy transport(nil) with wrapTransport and upgradeRequired parameters set to off (false)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add comment for each field.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

s.host.StreamingConnectionIdleTimeout(),
remotecommandconsts.DefaultStreamCreationTimeout,
remotecommandconsts.SupportedStreamingProtocols)
handler := proxy.NewUpgradeAwareHandler(url, nil, false, false, &responder{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should responder be a proxy.ErrorResponder or nil here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

proxy.ErrorResponder is just an interface. responder implements it.

return err
}
// Use the actual address as baseURL host. This handles the "0" port case.
s.config.BaseURL.Host = listener.Addr().String()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably comment the baseurl structure element on line 74 above so people know it will/may be updated after start() otherwise someone might rval it early...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

// IndirectStreamingRuntime is the interface implemented by runtimes that handle the serving of the
// StreamingRuntime is the interface implemented by runtimes that handle the serving of the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should make a comment here about DirectStreamingRuntime and IndirectStreamingRuntime being end of life at least for the time being. Maybe with a link to the pr that removed them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is just kubelet implementation detail. I don't think anyone will come back and cleanup if we put such a comment here.

Like when we remove rkt and dockershim package, we won't leave a comment saying dockershim was removed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True.. I saw the upper case first char.. and since we may have a future implementation with a switch, I figured a link to the past. But as you say the link is in github.

@Random-Liu
Copy link
Member Author

I'll fix the unit test.

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 18, 2018
@Random-Liu
Copy link
Member Author

Random-Liu commented May 18, 2018

Fixed the unit test. The unit test was testing DirectStreamingRuntime, which is not supported anymore.
This PR changed the test to test against the proxy mode instead.

@Random-Liu
Copy link
Member Author

/test containerd

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 1, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Random-Liu, yujuhong

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tallclair
Copy link
Member

/sig auth

@k8s-ci-robot k8s-ci-robot added the sig/auth Categorizes an issue or PR as relevant to SIG Auth. label Jun 1, 2018
@k8s-github-robot
Copy link

[MILESTONENOTIFIER] Milestone Pull Request: Up-to-date for process

@Random-Liu @tallclair @yujuhong

Pull Request Labels
  • sig/auth sig/cluster-lifecycle sig/node: Pull Request will be escalated to these SIGs if needed.
  • priority/important-soon: Escalate to the pull request owners and SIG owner; move out of milestone after several unsuccessful escalation attempts.
  • kind/feature: New functionality.
Help

// and apiserver will access container runtime directly. This approach is more performant,
// but less secure because the connection between apiserver and container runtime is not
// authenticated.
RedirectContainerStreaming bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a FeatureGate instead?

Copy link
Member Author

@Random-Liu Random-Liu Jun 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more an internal refactoring than a "feature".

We didn't even want to add the flag initially, because we don't want user to care about this. But eventually we decided to do so because we don't have performance data yet, and we want to leave a knob for people who care about the overhead.

If the data in #64006 (comment) makes sense, I even want to remove the flag. :P

@@ -77,6 +86,7 @@ func (s *ContainerRuntimeOptions) AddFlags(fs *pflag.FlagSet) {
// General settings.
fs.StringVar(&s.ContainerRuntime, "container-runtime", s.ContainerRuntime, "The container runtime to use. Possible values: 'docker', 'remote', 'rkt (deprecated)'.")
fs.StringVar(&s.RuntimeCgroups, "runtime-cgroups", s.RuntimeCgroups, "Optional absolute name of cgroups to create and run the runtime in.")
fs.BoolVar(&s.RedirectContainerStreaming, "redirect-container-streaming", s.RedirectContainerStreaming, "Enables container streaming redirect. If false, kubelet will proxy container streaming data between apiserver and container runtime; if true, kubelet will return an http redirect to apiserver, and apiserver will access container runtime directly. The proxy approach is more secure, but introduces some overhead. The redirect approach is more performant, but less secure because the connection between apiserver and container runtime is not authenticated.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "not authenticated" -> "may not be authenticated"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}()
go func() {
if err := ds.streamingServer.Start(true); err != nil && err != http.ErrServerClosed {
glog.Fatalf("Failed to start streaming server: %v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: server may have started, maybe "Streaming server stopped unexpectedly: %v"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -77,6 +86,7 @@ func (s *ContainerRuntimeOptions) AddFlags(fs *pflag.FlagSet) {
// General settings.
fs.StringVar(&s.ContainerRuntime, "container-runtime", s.ContainerRuntime, "The container runtime to use. Possible values: 'docker', 'remote', 'rkt (deprecated)'.")
fs.StringVar(&s.RuntimeCgroups, "runtime-cgroups", s.RuntimeCgroups, "Optional absolute name of cgroups to create and run the runtime in.")
fs.BoolVar(&s.RedirectContainerStreaming, "redirect-container-streaming", s.RedirectContainerStreaming, "Enables container streaming redirect. If false, kubelet will proxy container streaming data between apiserver and container runtime; if true, kubelet will return an http redirect to apiserver, and apiserver will access container runtime directly. The proxy approach is more secure, but introduces some overhead. The redirect approach is more performant, but less secure because the connection between apiserver and container runtime is not authenticated.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should this be proxy-container-streaming so the false value maintains the previous behavior?

Copy link
Member Author

@Random-Liu Random-Liu Jun 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want proxy to be the default behavior, so that:

  1. Container streaming of all CRI container runtime is more secure.
  2. dockershim can work with crictl (it is not the main goal, but is something we do want)

if kubeDeps.TLSOptions != nil {
config.TLSConfig = kubeDeps.TLSOptions.Config
if !crOptions.RedirectContainerStreaming {
config.Addr = net.JoinHostPort("localhost", "0")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this use a unix socket instead of localhost? (more secure)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add a TODO.

I decided to go with localhost first because it is simpler. For unix socket, we need to parse URL returned by container runtime to figure out whether it is unix socket (container runtime could still return regular URL), and handle unix socket differently for NewUpgradeAwareHandler.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow up is fine, but I'm just curious - did you try this? I don't see anything in NewUpgradeAwareHandler that looks like it would have a problem with a unix URL?

s.host.StreamingConnectionIdleTimeout(),
remotecommandconsts.DefaultStreamCreationTimeout,
remotecommandconsts.SupportedStreamingProtocols)
handler := proxy.NewUpgradeAwareHandler(url, nil /*transport*/, false /*wrapTransport*/, false /*upgradeRequired*/, &responder{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set MaxBytesPerSec, or at least put a TODO

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

remotecommandconsts.DefaultStreamCreationTimeout,
remotecommandconsts.SupportedStreamingProtocols)
handler := proxy.NewUpgradeAwareHandler(url, nil /*transport*/, false /*wrapTransport*/, false /*upgradeRequired*/, &responder{})
handler.ServeHTTP(response.ResponseWriter, request.Request)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: refactor this to function, so we can be sure to use the same options for exec, attach & pf

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

s.host.StreamingConnectionIdleTimeout(),
remotecommandconsts.DefaultStreamCreationTimeout,
remotecommandconsts.SupportedStreamingProtocols)
handler := proxy.NewUpgradeAwareHandler(url, nil /*transport*/, false /*wrapTransport*/, false /*upgradeRequired*/, &responder{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the apiserver sets upgradeRequired for exec, attach & PF requests, so I think we should do the same.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -233,10 +235,16 @@ func (s *server) Start(stayUp bool) error {
return errors.New("stayUp=false is not yet implemented")
}

listener, err := net.Listen("tcp", s.config.Addr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned elsewhere, can we provide a "unix" option?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Jun 1, 2018

@Random-Liu: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-node-e2e-containerd a9fe427319d0f01aefe9dca97943cb445f402bf8 link /test pull-kubernetes-node-e2e-containerd
pull-kubernetes-e2e-containerd-gce a9fe427319d0f01aefe9dca97943cb445f402bf8 link /test pull-kubernetes-e2e-containerd-gce
pull-kubernetes-kubemark-e2e-gce-big 746c32d link /test pull-kubernetes-kubemark-e2e-gce-big

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@k8s-github-robot
Copy link

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 8d10a8f into kubernetes:master Jun 1, 2018
@Random-Liu Random-Liu deleted the streaming-auth branch June 1, 2018 07:23
@Random-Liu
Copy link
Member Author

@tallclair The PR was merged... I'll address your comments in a new PR.

@Random-Liu
Copy link
Member Author

Random-Liu commented Jun 1, 2018

I did some simple performance benchmark in a local cluster today.

Benchmark:

  1. Create a local cluster with/without --redirect-container-streaming flag;
  2. Create a busybox pod;
  3. Exec into the busybox pod and run while true; do echo a; done;
  4. Use top to monitor CPU and memory usage of kubelet and apiserver (cadvisor port is disabled now, I decided to use top...)

Result:

kubelet cpu (cores) kubelet rss (mb) apiserver cpu (cores) apiserver rss (mb)
proxy 1.6 134.7 0.65 402.6
redirect 1.16 134.9 0.87 432.3
  • It seems that with proxy, kubelet has higher cpu usage, because of the extra hop, but no much memory usage difference.
  • But with redirect, apiserver has higher cpu and memory usage. I don't know the reason yet, but probably because kubelet proxy helps throttle the stream (?).

The benchmark is very simple. Please feel free to reproduce this.

Anyway, this is the data I get. And if it turns out that redirect has higher apiserver resource usage, I prefer proxy more because apiserver is the actual bottleneck.

@yujuhong @tallclair @mrunalp @feiskyer
/cc @kubernetes/sig-node-bugs

@Random-Liu
Copy link
Member Author

Random-Liu commented Jun 1, 2018

Here is the result if I run 2 exec at the same time:

kubelet cpu (cores) kubelet rss (mb) apiserver cpu (cores) apiserver rss (mb)
proxy 1.7 137.8 0.6 391.7
redirect 1.2 138.4 0.98 428.8

Apiserver still have more source usage in the redirect approach. Interesting...

Random-Liu added a commit to Random-Liu/kubernetes that referenced this pull request Jun 2, 2018
Signed-off-by: Lantao Liu <lantaol@google.com>
k8s-github-robot pushed a commit that referenced this pull request Jun 3, 2018
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Address comments in #64006.

Address comments in #64006 

@tallclair @yujuhong 
@kubernetes/sig-node-pr-reviews 
Signed-off-by: Lantao Liu <lantaol@google.com>

**Release note**:

```release-note
none
```
wenjiaswe pushed a commit to wenjiaswe/kubernetes that referenced this pull request Jun 19, 2018
Signed-off-by: Lantao Liu <lantaol@google.com>
@What871As
Copy link

[plugins.cri]
stream_server_address = "127.0.0.1"
stream_server_port = "0"
enable_tls_streaming = false

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubeadm cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants