
Support filtering monitored containers by container label #2380

Open · stevebail opened this issue Jan 19, 2020 · 30 comments

Comments

@stevebail

I am working with the kubelet-embedded cAdvisor that comes with a Kubernetes cluster.
I know that cAdvisor exposes container stats as Prometheus metrics, but I am not very familiar with how to retrieve them manually using curl.
What commands would tell me 1) whether cAdvisor is running on each node, 2) what version it is, and 3) what port it is exposing?
Are the cAdvisor metrics always served on the /metrics endpoint?
Your help is greatly appreciated.

@dashpole
Collaborator

:10255/metrics/cadvisor is where you can find them in recent releases.
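For example, from the node itself (a sketch only; the read-only port 10255 is disabled in some setups, in which case the same path is served on the authenticated port 10250):

node01 $ curl -s http://localhost:10255/metrics/cadvisor | head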

@stevebail
Author

stevebail commented Jan 21, 2020

@dashpole
@juliusv

Thanks David!

I now know three different ways to scrape cAdvisor metrics; see below.
Which option is recommended for the current release and going forward?

Option 1) Scrape the API server, which proxies to each node in the cluster
:--api-server-port--/api/v1/nodes/--node-name--/proxy/metrics/cadvisor
Example: :8443/api/v1/nodes/node01/proxy/metrics/cadvisor

Option 2) Scrape the kubelet port on each node
:--kubelet-port--/metrics/cadvisor
Example: :10255/metrics/cadvisor

Option 3) Scrape each cAdvisor pod deployed as a DaemonSet, configured with
:--cadvisor-port--/metrics
Example: :8080/metrics
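Rough curl equivalents of the three options, as I understand them (host names, ports and the token variable are placeholders, not values from my cluster):

$ curl -sk -H "Authorization: Bearer $TOKEN" https://--api-server--:8443/api/v1/nodes/node01/proxy/metrics/cadvisor   # option 1
$ curl -s http://node01:10255/metrics/cadvisor      # option 2
$ curl -s http://--cadvisor-pod-ip--:8080/metrics   # option 3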

@dashpole
Collaborator

Option 1 and option 2 are the same endpoint; one is just proxied by the API server. Prefer (2) when possible because it is more direct. If you want to customize the set of metrics exposed by cAdvisor, you can run it yourself as a DaemonSet. If you just want the metrics in the kubelet's /metrics/cadvisor endpoint, I would just use that to save on resource consumption.

@stevebail
Author

stevebail commented Jan 22, 2020

@dashpole I have one specific question about cAdvisor. I noticed that node_exporter supports "collectors" with the ability to enable/disable them at the source. Does cAdvisor support collectors so that the user can select which groups of cAdvisor metrics to enable or disable? Do I need to install the cAdvisor DaemonSet for this? Or will Prometheus get all cAdvisor metrics on each scrape? If so, is it possible to filter some of them in the Prometheus server?

@dashpole
Collaborator

You can use the --disable_metrics flag to specify the set of metrics you don't want.
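For example, something along these lines if you run cAdvisor yourself (a sketch only; the volumes and image are the usual ones from the cAdvisor README, and some of these metric groups are already disabled by default):

$ docker run -d \
    --volume=/:/rootfs:ro \
    --volume=/var/run:/var/run:ro \
    --volume=/sys:/sys:ro \
    --volume=/var/lib/docker/:/var/lib/docker:ro \
    --publish=8080:8080 \
    --name=cadvisor \
    google/cadvisor:latest \
    --disable_metrics=disk,tcp,udp,sched,process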

@stevebail
Author

  1. What is the command to query the cAdvisor runtime flags?
  2. I see the following options for --disable_metrics: 'disk', 'network', 'tcp', 'udp', 'sched', 'process'. Do any of those options control the cAdvisor Prometheus metrics? I guess we get all cAdvisor Prometheus metrics by default?

@dashpole
Collaborator

  1. It should be ./cadvisor --help
  2. All of those options control cAdvisor Prometheus metrics. By default you only get the metrics that are not expensive; most of the disabled metrics have a large number of metric streams for each container.

@stevebail
Author

I am in the root directory on the node and the cadvisor file cannot be found:

node01 $ /.cadvisor --help
-bash: /.cadvisor: No such file or directory

@dashpole
Collaborator

Yeah, you will need to run that against the cAdvisor binary, which most likely isn't in the root directory of your node. Try:

docker run google/cadvisor --help

from anywhere you have Docker.

@stevebail
Author

The k8s cluster is already running and I see kubelet is reachable on port 10250:

node01 $ netstat -plant | grep kubelet
tcp 0 0 127.0.0.1:38235 0.0.0.0:* LISTEN 1369/kubelet
tcp 0 0 127.0.0.1:10248 0.0.0.0:* LISTEN 1369/kubelet
tcp 0 0 172.17.0.42:53624 172.17.0.24:6443 ESTABLISHED 1369/kubelet
tcp 0 0 172.17.0.42:53676 172.17.0.24:6443 ESTABLISHED 1369/kubelet
tcp6 0 0 :::10250 :::* LISTEN 1369/kubelet

Do you think I still need to install the cAdvisor binary on the node?

@dashpole
Collaborator

See this comment above. You only need to run cAdvisor separately if you need to customize the set of metrics. Also, I would not recommend installing it manually; I would use a DaemonSet with the Docker image instead.

@stevebail
Author

Got it. Sorry for taking so much of your time.
Last question:
If I don't want to customize metrics but only want to see the current cAdvisor runtime flags in my k8s cluster (running with the kubelet on each node), how do I do that?

@dashpole
Collaborator

In my cluster, the command line flags for the kubelet are stored in /etc/default/kubelet, but that may change based on your setup...
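A few ways to check, depending on how the kubelet is managed (paths and service names vary by distribution, so treat these as starting points rather than exact commands):

node01 $ ps -o args= -C kubelet      # flags of the running kubelet process
node01 $ cat /etc/default/kubelet    # Debian/Ubuntu-style defaults file
node01 $ systemctl cat kubelet       # systemd unit and drop-ins, if systemd-managed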

@stevebail
Author

@dashpole
I have a use case where I just want to collect my container metrics and I don't have cluster-admin rights. For instance, I am a cluster user and I just need cAdvisor metrics for the containers in my namespace. So far, the cAdvisor scraping options that I know of require access to the node IP addresses or sufficient RBAC privileges to scrape the cAdvisor endpoint via the API server. What are my options if my RBAC is limited to my namespace and I just want cAdvisor metrics for the containers in my namespace?

@dashpole
Collaborator

We don't really support that use case today. If you run cAdvisor as a DaemonSet, the pod would need to be privileged for host filesystem access anyway, so there isn't really a good way to use it without elevated privileges.

@stevebail
Author

@dashpole
Hi David
I am thinking of requesting an enhancement to cAdvisor: the ability to collect container stats only for containers that are in the same namespace as the cAdvisor DaemonSet. I think it makes sense to support such a use case, since a user may only be interested in their own containers' stats. OK to proceed? How should we proceed?

@dashpole
Collaborator

How would cAdvisor know the namespace of the container?

@stevebail
Author

I don't think it is very difficult, but I am no expert. You tell me :)
I suspect automatically discovering the monitored namespace is not so easy (?).
What about starting with a manual approach where the namespace is provided through configuration (say, as an argument) of the cAdvisor container...

@dashpole
Collaborator

Namespace is a Kubernetes construct, and cAdvisor doesn't "understand" Kubernetes constructs. Say we want to collect metrics only for containers in namespace foo. This is how it currently works:

  1. cAdvisor discovers the cgroup with ID 5498743594325698432u85342k
  2. cAdvisor queries the container runtime for 5498743594325698432u85342k and gets the container name, image, etc. The namespace is not included, since the container runtime doesn't know about Kubernetes namespaces.
  3. ?

@stevebail
Author

In the docker/container runtime I see the following data:
"io.kubernetes.pod.namespace": "foo"

@dashpole
Collaborator

That is a container label. We probably don't want to rely on that, as it isn't an actual API. We could potentially have label filtering (e.g. only collect metrics for containers where label foo=bar).

@stevebail
Author

That would be great David!

dashpole changed the title from "How to query cAdvisor metrics" to "Support filtering monitored containers by container label" on Mar 18, 2020
@mariadb-zdraganov

Is there actual support for passing match[] parameters to the /metrics endpoint?

@stevebail
Author

I don't think it currently does; that would be part of this proposed enhancement.
You presumably also mean the /metrics/cadvisor endpoint (i.e. cAdvisor in the kubelet)?
This is for @dashpole to clarify.

@dashpole
Collaborator

cAdvisor in the kubelet has its own labeling for cAdvisor metrics: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/server/server.go#L959

I think you should be able to use the store container labels and label whitelist flags here:
https://github.com/google/cadvisor/blob/master/cmd/cadvisor.go#L71
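If you run cAdvisor standalone, that would look roughly like this (flag names as they appear in cmd/cadvisor.go linked above; double-check the exact semantics against --help for your version):

$ ./cadvisor \
    --store_container_labels=false \
    --whitelisted_container_labels=io.kubernetes.pod.namespace,io.kubernetes.pod.name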

@stevebail
Author

@dashpole
Hi David.
Is it possible to get an update on the filtering feature request, i.e. the ability to collect cAdvisor metrics based on a container label whitelist?

@celian-garcia

celian-garcia commented Aug 18, 2021

@stevebail Did you try whitelisting in the Prometheus scrape job?
I'm doing it successfully:

  - job_name: cadvisor
    scrape_interval: 5s
    static_configs:
    - targets:
      - cadvisor:8080
    metric_relabel_configs:
    - source_labels: [ container_label_prometheus_io_scrape ]
      regex: True
      action: keep

My whitelisted containers have the following label:

labels:
  prometheus.io/scrape: true
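For completeness, the same label attached when starting a container directly with the Docker CLI (the container name and image are placeholders):

$ docker run -d --label prometheus.io/scrape=true --name my-app my-image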

@stevebail
Author

@celian-garcia
Thank you for the suggestion.
I am looking for a way to keep a given container metric for certain containers and drop the same metric for unwanted containers.

@celian-garcia

Yeah, I made the suggestion mainly for people like me who want to filter containers by label and who control the Prometheus configuration. If that is not your case, the solution won't fit your need.

I still think the feature would be worth having in cAdvisor.

@zdraganov

> @stevebail Did you try whitelisting in the Prometheus scrape job? I'm doing it successfully: [scrape config and label quoted above]

The issue with this configuration is that the filtering is done in Prometheus rather than at the source, which can lead to significantly higher memory usage.
