Calculating uptime in Grafana from the successful responses of the Prometheus blackbox exporter

Problem description

I tried counting the probe_success samples and multiplying the count by the probe interval to get uptime in seconds, with the value type set to Total. The problem is that the minimum step changes with the dashboard time range, so the result is not correct and this approach is unusable. What we actually want is the uptime percentage, based on successful probes, over the time range set for the dashboard. We are using a singlestat panel to show the percentage.

(probe_success{instance="www.google.com:443",job="clienttest"})*15

We also tried dividing the value by the same metric for the exporter itself, hoping to get a percentage that would scale with the time range, to no avail.

sum(probe_success{instance="www.google.com:443",job="clienttest"}) / sum(probe_success{instance="self",job="clienttest"}) *100

Recommended answer

For a singlestat panel, use just probe_success{instance="www.google.com:443",job="clienttest"} as the expression and, under Options, make sure the Average aggregation is selected. Since probe_success is 1 for a successful probe and 0 for a failed one, the average of the samples is the fraction of probes that succeeded.

On the PromQL side you can also use avg_over_time(probe_success[1h]); see this blog post.
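
As a sketch of the avg_over_time approach applied to the target from the question, the query below averages probe_success over the whole dashboard time range and scales it to a percentage. It assumes a Grafana version that provides the built-in $__range variable for Prometheus queries, and that the panel shows the current value of an instant query; neither detail comes from the original answer.

```promql
# Assumption: $__range is Grafana's built-in variable for the dashboard time range.
# probe_success is 1 for a successful probe and 0 for a failed one, so its
# average over the range is the fraction of successful probes; * 100 gives percent.
avg_over_time(probe_success{instance="www.google.com:443",job="clienttest"}[$__range]) * 100
```

Because the averaging happens inside Prometheus over the raw samples, the result should not depend on the step the panel picks for the selected time range.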


09-17 00:22