Prometheus
This simple pack (based on a community pack) does two things:
- It provides actions to query Prometheus.
- It provides a webhook receiver directly usable by Alertmanager.
Actions
- prometheus.query
  Executes an instant query using the Prometheus query API.
  Requires a query parameter containing the query to execute. Optionally, the url parameter changes the URL where the Prometheus API is reachable. It defaults to the url parameter of the pack’s configuration.
    # st2 run prometheus.query query="ALERTS{}"
    id: 5ea17fcd049f2e425784c8b6
    status: succeeded
    parameters:
      query: ALERTS{}
      url: http://monitor1.mg1.hpc.domain.fr/prometheus
    result:
      exit_code: 0
      result:
        data:
          result:
          - metric:
              __name__: ALERTS
              alertname: Infiniband DF+ All-to-All missing link
              alertstate: firing
              cluster: ppi
              desc: i38r2isw3
              remote_desc: i46r2isw3
              service: Infiniband
              severity: warning
            value:
            - 1587642318.073
            - '1'
          resultType: vector
        status: success
      stderr: ''
      stdout: ''
- prometheus.series
  Lists available time series using the Prometheus series API.
  Requires a queries parameter containing the URL parameters as specified in the Prometheus documentation.
    # st2 run prometheus.series queries="match[]=up"
    id: 5ea18029049f2e425784c8b9
    status: succeeded
    parameters:
      queries:
        match[]: up
      url: http://monitor1.mg1.hpc.domain.fr/prometheus
    result:
      exit_code: 0
      result:
        data:
        - __name__: up
          cluster: ppi
          fabric: hdr-compute
          instance: irene245:9199
          job: infiniband_compute
          service: infiniband
        [...]
      stderr: ''
      stdout: ''
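As a rough illustration of what these two actions talk to, the sketch below builds request URLs for the Prometheus HTTP API endpoints /api/v1/query and /api/v1/series, and unpacks a vector result shaped like the one in the example output. It is not the pack's code: the function names are hypothetical, and the base URL is simply the one shown in the examples above.

```python
from urllib.parse import urlencode

# Base URL copied from the example output; adjust it for your deployment.
BASE_URL = "http://monitor1.mg1.hpc.domain.fr/prometheus"


def instant_query_url(query, base_url=BASE_URL):
    """Build the URL of an instant query (GET /api/v1/query)."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': query})}"


def series_url(matchers, base_url=BASE_URL):
    """Build the URL of a series listing (GET /api/v1/series).

    Each matcher is sent as a repeated match[] URL parameter.
    """
    params = urlencode([("match[]", matcher) for matcher in matchers])
    return f"{base_url.rstrip('/')}/api/v1/series?{params}"


def vector_samples(response):
    """Return (labels, value) pairs from a decoded instant-query response."""
    if response.get("status") != "success":
        raise ValueError("query did not succeed")
    data = response["data"]
    if data.get("resultType") != "vector":
        raise ValueError("expected a vector result")
    # Each sample carries [timestamp, value]; the value is encoded as a string.
    return [(s["metric"], float(s["value"][1])) for s in data["result"]]


# Response trimmed down from the prometheus.query example output.
EXAMPLE_RESPONSE = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "ALERTS", "alertstate": "firing"},
                "value": [1587642318.073, "1"],
            }
        ],
    },
}

samples = vector_samples(EXAMPLE_RESPONSE)
```

The URLs can be passed to any HTTP client. Special characters in the query, such as the braces in ALERTS{}, are percent-encoded by urlencode.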
Rules
The pack ships with an example rule showing how to set up a webhook sensor in StackStorm and how to configure it in Prometheus.
This rule uses the core.st2.webhook trigger type, which creates a generic sensor on demand. This kind of rule allows an external program to post arbitrary data to a StackStorm webhook endpoint (https://HOST/api/v1/webhooks/NAME, where NAME is the url parameter of the core.st2.webhook trigger).
Thus, the following receiver can be added to Alertmanager:
# /etc/alertmanager/alertmanager.yaml
[...]
receivers:
- name: admins
  email_configs:
  - to: admins@mg1.hpc.domain.fr
    send_resolved: true
    html: '{{ template "email.html" . }}'
    headers:
      Subject: '{{ template "ppi_subject" . }}'
  webhook_configs:
  - send_resolved: true
    url: https://auto1.mg1.hpc.domain.fr/api/v1/webhooks/prometheus_webhook
[...]
This emits trigger instances like the following:
# st2 trigger-instance get 5ea167c8049f2e427f080b0c -y
id: 5ea167c8049f2e427f080b0c
occurrence_time: '2020-04-23T10:02:48.000000Z'
payload:
  body:
    [...]
  headers:
    Accept: '*/*'
    Content-Length: '14'
    Content-Type: application/json
    Host: auto1.mg1.hpc.domain.fr,auto1.mg1.hpc.domain.fr
    User-Agent: curl/7.29.0
    X-Forwarded-For: 127.0.0.1
    X-Real-Ip: 127.0.0.1
    X-Request-Id: 766098ce-f8e6-4a21-9204-df63bbf3c1bb
status: processed
trigger: core.6023c805-8b30-4bf6-9fbc-e4f34a850a47
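The User-Agent header in this example shows that the request came from curl rather than from Alertmanager. A similar test request can be prepared in Python, as in the sketch below. The helper name build_test_request and its extra_headers parameter are assumptions made for illustration. If the StackStorm API requires authentication, the corresponding headers would have to be supplied through extra_headers. The request is only built here, not sent.

```python
import json
from urllib.request import Request

# Webhook URL copied from the Alertmanager receiver shown earlier.
WEBHOOK_URL = "https://auto1.mg1.hpc.domain.fr/api/v1/webhooks/prometheus_webhook"


def build_test_request(payload, url=WEBHOOK_URL, extra_headers=None):
    """Build (but do not send) a JSON POST request for the webhook endpoint."""
    headers = {"Content-Type": "application/json"}
    headers.update(extra_headers or {})
    body = json.dumps(payload).encode("utf-8")
    return Request(url, data=body, headers=headers, method="POST")


# A deliberately tiny payload; real Alertmanager bodies carry more fields.
request = build_test_request({"status": "firing", "alerts": []})

# To send it for real, uncomment the two lines below:
# from urllib.request import urlopen
# urlopen(request)
```

Keeping construction and sending separate makes it possible to inspect the URL, headers and body before anything reaches StackStorm.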
The body of a request sent by Alertmanager is a JSON document in the following format:
{
  "version": "4",
  "groupKey": <string>,    // key identifying the group of alerts (e.g. to deduplicate)
  "status": "<resolved|firing>",
  "receiver": <string>,
  "groupLabels": <object>,
  "commonLabels": <object>,
  "commonAnnotations": <object>,
  "externalURL": <string>,  // backlink to the Alertmanager.
  "alerts": [
    {
      "status": "<resolved|firing>",
      "labels": <object>,
      "annotations": <object>,
      "startsAt": "<rfc3339>",
      "endsAt": "<rfc3339>",
      "generatorURL": <string> // identifies the entity that caused the alert
    },
    ...
  ]
}
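An action consuming this payload often needs only a few fields per alert. The sketch below, which assumes a body in the format above, reduces each alert to its name, severity, status and start time. The function names are hypothetical, and the sample body is invented for the example, borrowing label values from the ALERTS output shown earlier.

```python
import json


def summarize_alerts(body):
    """Return one small dict per alert found in an Alertmanager webhook body."""
    payload = json.loads(body) if isinstance(body, (str, bytes)) else body
    summaries = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        summaries.append({
            "alertname": labels.get("alertname"),
            "severity": labels.get("severity"),
            "status": alert.get("status"),
            "startsAt": alert.get("startsAt"),
        })
    return summaries


def firing_only(summaries):
    """Keep only the alerts that are still firing."""
    return [s for s in summaries if s["status"] == "firing"]


# Invented sample body following the documented format.
SAMPLE_BODY = json.dumps({
    "version": "4",
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {
                "alertname": "Infiniband DF+ All-to-All missing link",
                "severity": "warning",
            },
            "annotations": {},
            "startsAt": "2020-04-23T10:02:48Z",
        },
        {
            "status": "resolved",
            "labels": {"alertname": "ExampleResolvedAlert", "severity": "warning"},
            "annotations": {},
            "startsAt": "2020-04-23T09:00:00Z",
        },
    ],
})

summaries = summarize_alerts(SAMPLE_BODY)
still_firing = firing_only(summaries)
```

Missing labels come back as None instead of raising, so a payload without a severity label does not break the summary.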