cAdvisor should export pod labels for container metrics #32326

Closed
hanikesn opened this issue Sep 8, 2016 · 42 comments
Labels
area/kubelet area/monitoring lifecycle/rotten sig/node

Comments

@hanikesn

hanikesn commented Sep 8, 2016

Currently cAdvisor doesn't export pod labels for container-level metrics. This would be desirable in order to aggregate container-level metrics by application.

As Kubernetes doesn't set pod labels on Docker containers (#25301), cAdvisor can't export those labels on its /metrics endpoint, which makes metric aggregation by pod labels impossible. Kubernetes doesn't set these labels because it's currently not possible to dynamically set Docker container labels (moby/moby#21721), and it's not clear when this situation will change.

I propose implementing a workaround so that cAdvisor can get the labels directly from the kubelet and export them accordingly.

@k8s-github-robot k8s-github-robot added area/kubelet sig/node labels Sep 8, 2016
@grobie
Contributor

grobie commented Sep 14, 2016

At SoundCloud we work around the problem by using an extra exporter which exports all pod labels in a separate metric; we then join these together in our queries. This is similar to the approach described in @brian-brazil's blog post about machine role labels. Exporting a lot of labels per metric makes it more difficult to work with them, so this might even be the more desirable approach.

@fabxc @kubernetes/sig-instrumentation We should discuss this need in our next meeting.
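
For illustration, a minimal sketch of that query-time join in PromQL, assuming the extra exporter exposes a hypothetical pod_labels info metric (value 1) carrying the pod's app label; the metric and label names here are assumptions, not SoundCloud's actual exporter, and the cAdvisor-side label names vary across Kubernetes versions:

sum by (namespace, pod, app) (
  # many-to-one join: copy the 'app' label from pod_labels onto each container series
  rate(container_cpu_usage_seconds_total[5m])
  * on (namespace, pod) group_left (app)
  pod_labels
)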

@davidopp
Member

I propose implementing a workaround so that cAdvisor can get the labels directly from the kubelet and export them accordingly.

IIUC this is the approach @vishh has advocated as well, particularly as part of the work for #18770.

@brian-brazil

http://www.robustperception.io/exposing-the-software-version-to-prometheus/ is a slightly more relevant version of that blog post.

@jimmidyson
Member

@grobie @brian-brazil Although we're using Prometheus' exposition format, we do have to make sure that any decisions we make around labelling don't restrict/negatively affect other consumers. If this was a Prometheus-only decision then I would 100% agree with you though.

@fabxc
Contributor

fabxc commented Sep 14, 2016

Although we're using Prometheus' exposition format, we do have to make sure that any decisions we make around labelling don't restrict/negatively affect other consumers. If this was a Prometheus-only decision then I would 100% agree with you though.

I think it's a reasonable step of normalization to constrain labels to the minimum identity-giving set in the exposition. The consumer is free to denormalize again for its own purposes – in Prometheus this happens to be at query time. But other systems can easily do so on write.
Going the other direction is generally harder for the consumer while bloating the exposed metrics.

Of course it's saner if it's happening in the same metric set. But #18770 aims at just that.

@hanikesn
Author

@grobie Any chance of open sourcing that?

@vishh
Contributor

vishh commented Sep 14, 2016

My idea was that the kubelet should expose an extension API that lets monitoring agents figure out the list of pods on the node along with detailed runtime information like container & image identifiers, etc. cAdvisor, being a monitoring agent, can then use this API to come up with pod-level metrics and metadata for k8s pods.

@tomwilkie

tomwilkie commented Dec 29, 2016

@hanikesn @grobie FYI just added support for exporting a pod-label-metric for kube-api-exporter: tomwilkie/kube-api-exporter#9

I also wrote a blog post on how to do the join in Prometheus: https://www.weave.works/aggregating-pod-resource-cpu-memory-usage-arbitrary-labels-prometheus/
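
A sketch of the query-time join the post describes, assuming kube-api-exporter exposes a k8s_pod_labels info metric whose namespace/pod_name labels match the cAdvisor series and whose pod labels are prefixed with label_ (both the metric name and the label names are assumptions):

sum by (label_app) (
  # attach the pod's 'app' label, then aggregate memory usage per application
  container_memory_usage_bytes
  * on (namespace, pod_name) group_left (label_app)
  k8s_pod_labels
)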

@natalia-k

Is there any chance that this will be implemented in the next release?

Thanks a lot!

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label Dec 21, 2017
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten and removed lifecycle/stale labels Jan 20, 2018
@piosz
Member

piosz commented Jan 25, 2018

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten label Jan 25, 2018
@piosz
Member

piosz commented Jan 25, 2018

cc @dchen1107

@Nowaker

Nowaker commented Feb 9, 2018

This would be very useful.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label May 10, 2018
@discostur

@hanikesn any news on this?

@foxish
Contributor

foxish commented Jun 8, 2018

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale label Jun 8, 2018
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@beorn7

beorn7 commented Sep 7, 2018

Note that prometheus/client_golang doesn't require consistent label dimensions anymore, which might come in handy here. See prometheus/client_golang#417. Also, you can now have unchecked Collectors in a formally correct way.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label Dec 6, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten and removed lifecycle/stale labels Jan 5, 2019
@Nowaker

Nowaker commented Jan 5, 2019

@brancz Still planned?

@brancz
Member

brancz commented Jan 7, 2019

@Nowaker the metrics overhaul has been moved to 1.14, but the pull request to make this change is already out!

@Nowaker

Nowaker commented Jan 7, 2019

Yay! Thanks a ton @brancz.

@micahhausler
Member

@brancz got a reference for the 1.14 PR?

@brancz
Member

brancz commented Feb 5, 2019

The PR is here: #69099

@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

@brancz
Member

brancz commented Mar 7, 2019

#69099 was merged, so this is landing in 1.14 🎉

@alvaroaleman
Member

/reopen

@hangyan @brancz I believe closing this issue via #69099 is a misunderstanding. #69099 is about adding the literal pod label (the pod's name) to the cAdvisor metrics. This issue is about adding the labels that are set on the pod as labels on the cAdvisor metrics.

@k8s-ci-robot
Contributor

@alvaroaleman: Reopened this issue.

@k8s-ci-robot k8s-ci-robot reopened this May 16, 2019
@brancz
Member

brancz commented May 17, 2019

I see. I think that’s metadata that should be joined onto the metric at query time (this data is already available in kube-state-metrics). More importantly, labels on a pod can change over the pod’s lifetime; if we added pod labels to these metrics, a label change would mark the old series stale and produce a new time series, which wouldn’t reflect reality.
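
For reference, a minimal sketch of that query-time join against the kube_pod_labels metric from kube-state-metrics, assuming the pods carry an app label (exposed by kube-state-metrics as label_app):

sum by (label_app) (
  # container!="" drops the pod-level aggregate series to avoid double counting
  rate(container_cpu_usage_seconds_total{container!=""}[5m])
  * on (namespace, pod) group_left (label_app)
  kube_pod_labels
)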

@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

@sirkubax

I partially solved my case (I needed a 'deployment' or 'app' label as a filter for pod names):
grafana/grafana#28447

But how nice would it be if cAdvisor could forward an 'app' label along with the metrics...

@sirkubax

/reopen

please export all pod labels ('app', 'deployment', etc.), like this:
__meta_kubernetes_pod_label_app="schema-registry-ui"
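
For pods scraped directly (kubernetes_sd_configs with role: pod), that meta label can already be attached at scrape time via relabeling; a minimal sketch, assuming the pods carry an 'app' label (this helps for metrics the pods expose themselves, but not for the cAdvisor metrics this issue is about, which are scraped from the kubelet):

scrape_configs:
  - job_name: pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # copy the pod's 'app' label onto every series scraped from this target
      - source_labels: [__meta_kubernetes_pod_label_app]
        target_label: app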

@k8s-ci-robot
Contributor

@sirkubax: You can't reopen an issue/PR unless you authored it or you are a collaborator.

@will-beta

Any updates on this?

@woodliu

woodliu commented Nov 26, 2021

/reopen
Is there any plan to support adding pod labels within cAdvisor?

@k8s-ci-robot
Contributor

@woodliu: You can't reopen an issue/PR unless you authored it or you are a collaborator.

@shenshouer

How to?
