Problem description

I have set up Prometheus via Helm from Terraform, and it is configured to connect to my Kubernetes cluster. I opened Prometheus, but I am not sure which metric to choose from the list to view the CPU/MEM of running pods/jobs. Here are all the pods running, listed with this command (test1 is the kube namespace):

kubectl -n test1 get pods

[Screenshot: pods running]

When I am in Prometheus, I see many metrics related to CPU, but I am not sure which one to choose:

[Screenshot: prom1]

I tried to choose one, but it shows namespace = prometheus and uses prometheus-node-exporter, and I don't see my cluster or my namespace test1 anywhere.

[Screenshot: prom2]

Could you please help me? Thank you very much in advance.

UPDATE

[Screenshot]

I need to concentrate on one specific namespace. Normally, with the command:

kubectl get pods --all-namespaces | grep hermatwin

I see that the first line has namespace = jobs, and I think this is the namespace.

No results when I set the calendar to last Friday:

UPDATE (April 20)

I tried to select a 2-day range starting on last Saturday, 17 April, but I don't see any results:

And if I remove the (namespace="jobs") condition, I don't see any results either:

I reran the job (a simulation job) just now and executed the Prometheus query while the job was still running, but I did not get any results :-( Here you can see that my jobs were running.

I don't get any results:

When using a simple filter, just container_cpu_usage_seconds_total, I can see namespace="jobs"

Solution

node_cpu_seconds_total is a metric from node-exporter, the exporter that provides machine-level statistics; its metrics are prefixed with node_. You need metrics from cAdvisor, which produces container-related metrics prefixed with container_:

container_cpu_usage_seconds_total
container_cpu_load_average_10s
container_memory_usage_bytes
container_memory_rss
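
The prefix convention above can help when sorting through a long metric list. The following is a minimal Python sketch, not part of the original answer: the function name group_by_prefix and the sample names are illustrative, and the prefix rule is only a naming convention, so treat the grouping as a hint, not a guarantee of where a metric comes from.

```python
def group_by_prefix(metric_names):
    """Group metric names by the exporter their prefix suggests."""
    groups = {"container": [], "node": [], "other": []}
    for name in metric_names:
        if name.startswith("container_"):
            groups["container"].append(name)  # typically produced by cAdvisor
        elif name.startswith("node_"):
            groups["node"].append(name)       # typically produced by node-exporter
        else:
            groups["other"].append(name)
    return groups


# Illustrative names only; a real list would come from your Prometheus server.
sample_names = [
    "node_cpu_seconds_total",
    "container_cpu_usage_seconds_total",
    "container_memory_usage_bytes",
    "up",
]
groups = group_by_prefix(sample_names)
print(groups["container"])
```

For per-pod CPU and memory, the names in the "container" group are the ones to look at.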

Here are some basic queries to get you started. They may require tweaking, since your label names may differ:

CPU Utilisation Per Pod

sum(irate(container_cpu_usage_seconds_total{container!="POD", container=~".+"}[2m])) by (pod)
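
To make the query above less opaque, here is a rough Python sketch of what it computes. It illustrates the idea and is not Prometheus code: irate takes the last two samples of a counter inside the window and divides the increase by the elapsed seconds, and sum ... by (pod) adds up the per-container rates for each pod. The sample data and the pod names sim-job-1 and sim-job-2 are made up.

```python
from collections import defaultdict


def irate(samples):
    """Per-second rate from the last two (timestamp, value) samples of a counter."""
    if len(samples) < 2:
        return None  # at least two samples are needed inside the window
    (t1, v1), (t2, v2) = samples[-2], samples[-1]
    if t2 <= t1:
        return None
    # A drop in value is treated as a counter reset: the new value is the increase.
    delta = v2 if v2 < v1 else v2 - v1
    return delta / (t2 - t1)


def cpu_per_pod(series):
    """Sum irate over containers, grouped by the pod label."""
    totals = defaultdict(float)
    for labels, samples in series:
        container = labels.get("container", "")
        # Mirrors container!="POD", container=~".+" from the query.
        if container in ("", "POD"):
            continue
        rate = irate(samples)
        if rate is not None:
            totals[labels["pod"]] += rate
    return dict(totals)


# Hypothetical samples: (seconds, cumulative CPU seconds) per container.
example_series = [
    ({"pod": "sim-job-1", "container": "worker"}, [(0, 10.0), (15, 13.0), (30, 19.0)]),
    ({"pod": "sim-job-1", "container": "sidecar"}, [(0, 1.0), (15, 1.5), (30, 3.0)]),
    ({"pod": "sim-job-1", "container": "POD"}, [(0, 0.0), (15, 0.1), (30, 0.2)]),
    ({"pod": "sim-job-2", "container": "worker"}, [(15, 100.0), (30, 107.5)]),
]
print(cpu_per_pod(example_series))
```

The result is in CPU cores: a value of 0.5 means the pod used half a core over the last scrape interval.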

RAM Usage Per Pod

sum(container_memory_usage_bytes{container!="POD", container=~".+"}) by (pod)
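
Since the question focuses on a single namespace, the two queries above can be narrowed with a namespace matcher. Below is a small Python sketch that only assembles the query strings. The helper name per_pod_query is hypothetical, and it assumes your container metrics carry a namespace label (the question shows namespace="jobs" on container_cpu_usage_seconds_total); adjust the label name if your setup differs.

```python
def per_pod_query(metric, namespace=None, rate_window=None):
    """Build a per-pod PromQL query, optionally limited to one namespace."""
    matchers = ['container!="POD"', 'container=~".+"']
    if namespace:
        matchers.insert(0, f'namespace="{namespace}"')
    selector = f'{metric}{{{", ".join(matchers)}}}'
    if rate_window:
        # Counters such as CPU seconds need irate over a window; gauges do not.
        selector = f"irate({selector}[{rate_window}])"
    return f"sum({selector}) by (pod)"


cpu_query = per_pod_query("container_cpu_usage_seconds_total", "jobs", "2m")
mem_query = per_pod_query("container_memory_usage_bytes", "jobs")
print(cpu_query)
print(mem_query)
```

The printed strings can be pasted into the Prometheus expression browser.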

In/Out Traffic Rate Per Pod

Beware that pods with host network mode (not isolated) show traffic rate for the whole node. * 8 is to convert bytes to bits for convenience (MBit/s, GBit/s, etc).

# incoming
sum(irate(container_network_receive_bytes_total[2m])) by (pod) * 8
# outgoing
sum(irate(container_network_transmit_bytes_total[2m])) by (pod) * 8
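
The * 8 step is plain unit arithmetic. As a quick illustration, this Python sketch converts a byte rate to a readable bit rate; the function name format_bit_rate and the decimal (1000-based) unit steps are choices made for this example.

```python
def format_bit_rate(bytes_per_second):
    """Convert bytes/s to bits/s and scale to a readable decimal unit."""
    bits = bytes_per_second * 8  # 1 byte = 8 bits, same as "* 8" in the query
    for unit in ("bit/s", "kbit/s", "Mbit/s", "Gbit/s"):
        if bits < 1000 or unit == "Gbit/s":
            return f"{bits:.2f} {unit}"
        bits /= 1000


print(format_bit_rate(125_000))      # 125 kB/s
print(format_bit_rate(250_000_000))  # 250 MB/s
```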

06-19 11:00