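Configuring DingTalk alerts for kube-prometheus
Prerequisite: kube-prometheus is already deployed on Kubernetes
1. Configure the DingTalk robot
- Open the Smart Group Assistant in DingTalk and click Add Robot.
- Choose Custom Robot.
- Tick the signature (加簽) security option, copy the generated secret, and keep it for later.
- Copy the webhook URL, then click Save.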
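For background, the signature option works by appending a `timestamp` and a `sign` parameter to the webhook URL. prometheus-webhook-dingtalk does this for you once `secret` is set in its config, so the following is only a minimal sketch of the documented scheme (HMAC-SHA256 over `timestamp + "\n" + secret`, Base64-encoded, then URL-encoded). The function names and the sample values are hypothetical.

```python
import base64
import hashlib
import hmac
import urllib.parse


def dingtalk_sign(secret: str, timestamp_ms: int) -> str:
    """Return the URL-encoded signature for a millisecond timestamp."""
    # The string to sign is "<timestamp>\n<secret>", keyed with the secret itself.
    string_to_sign = f"{timestamp_ms}\n{secret}"
    digest = hmac.new(
        secret.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    # Base64-encode the raw digest, then URL-encode it for use in a query string.
    return urllib.parse.quote_plus(base64.b64encode(digest))


def signed_url(webhook_url: str, secret: str, timestamp_ms: int) -> str:
    """Append timestamp and sign to a webhook URL that already has access_token."""
    sign = dingtalk_sign(secret, timestamp_ms)
    return f"{webhook_url}&timestamp={timestamp_ms}&sign={sign}"
```

In real use the timestamp would be the current time in milliseconds, for example `int(time.time() * 1000)`.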
2. Write the dingtalk YAML deployment file
vi dingtalk.yaml
apiVersion: v1
kind: Service
metadata:
  name: dingtalk
  namespace: monitoring
spec:
  selector:
    app: dingtalk
  ports:
    - name: http
      protocol: TCP
      port: 8060
      targetPort: 8060
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dingtalk
  namespace: monitoring
  labels:
    app: dingtalk
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  selector:
    matchLabels:
      app: dingtalk
  template:
    metadata:
      labels:
        app: dingtalk
    spec:
      restartPolicy: "Always"
      containers:
        - name: dingtalk
          image: timonwong/prometheus-webhook-dingtalk:v2.1.0
          imagePullPolicy: "IfNotPresent"
          volumeMounts:
            - name: dingtalk-conf
              mountPath: /etc/prometheus-webhook-dingtalk/
          resources:
            limits:
              cpu: "400m"
              memory: "500Mi"
            requests:
              cpu: "100m"
              memory: "100Mi"
          ports:
            - containerPort: 8060
              name: http
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            periodSeconds: 5
            initialDelaySeconds: 30
            successThreshold: 1
            tcpSocket:
              port: 8060
          livenessProbe:
            tcpSocket:
              port: 8060
            initialDelaySeconds: 30
            periodSeconds: 10
      volumes:
        - name: dingtalk-conf
          configMap:
            name: dingtalk-cm
prometheus-webhook-dingtalk is an open-source DingTalk alerting plugin. At the time of writing, its latest release is still v2.1.0.
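The readiness and liveness probes in the Deployment are plain TCP checks against port 8060. The same kind of check can be run by hand, for instance after forwarding the Service locally with `kubectl port-forward -n monitoring svc/dingtalk 8060:8060` (an assumption about how you reach the Service, not a step from this guide). A minimal sketch, with `tcp_probe` as a hypothetical helper name:

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # Like a tcpSocket probe, success only means the port accepts connections.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

With the port-forward running, `tcp_probe("127.0.0.1", 8060)` should return `True` once the container is listening. It says nothing about whether the configuration itself is valid.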
3. Write the DingTalk alert template, dingtalk-configmap.yaml
vi dingtalk-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: dingtalk-cm
  namespace: monitoring
data:
  config.yml: |-
    templates:
      - /etc/prometheus-webhook-dingtalk/dingding.tmpl
    targets:
      webhook:
        url: https://oapi.dingtalk.com/robot/send?access_token=<access_token from the copied webhook URL>
        secret: "<secret copied when enabling the signature option>"
        message:
          text: '{{ template "dingtalk.to.message" . }}'
  dingding.tmpl: |-
    {{ define "dingtalk.to.message" }}
    {{- if gt (len .Alerts.Firing) 0 -}}
    {{- range $index, $alert := .Alerts.Firing -}}
    ========= **Monitoring Alert** =========
    **Cluster:** k8s
    **Alert name:** {{ $alert.Labels.alertname }}
    **Severity:** {{ $alert.Labels.severity }}
    **Status:** {{ .Status }}
    **Affected host:** {{ $alert.Labels.instance }} {{ $alert.Labels.device }}
    **Summary:** {{ .Annotations.summary }}
    **Details:** {{ $alert.Annotations.message }}{{ $alert.Annotations.description}}
    **Host labels:** {{ range .Labels.SortedPairs }} </br> [{{ .Name }}: {{ .Value | markdown | html }} ]
    {{- end }} </br>
    **Started at:** {{ ($alert.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}
    ========= = **end** = =========
    {{- end }}
    {{- end }}
    {{- if gt (len .Alerts.Resolved) 0 -}}
    {{- range $index, $alert := .Alerts.Resolved -}}
    ========= **Alert Resolved** =========
    **Cluster:** k8s
    **Summary:** {{ $alert.Annotations.summary }}
    **Host:** {{ .Labels.instance }}
    **Alert name:** {{ .Labels.alertname }}
    **Severity:** {{ $alert.Labels.severity }}
    **Status:** {{ .Status }}
    **Details:** {{ $alert.Annotations.message }}{{ $alert.Annotations.description}}
    **Started at:** {{ ($alert.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}
    **Resolved at:** {{ ($alert.EndsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}
    ========= = **end** = =========
    {{- end }}
    {{- end }}
    {{- end }}
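The `28800e9` passed to `.Add` in the template is a Go duration in nanoseconds, which is 8 hours. It shifts the alert timestamps to UTC+8 before they are formatted, assuming Alertmanager reports them in UTC. The equivalent arithmetic, sketched in Python with a hypothetical helper name:

```python
from datetime import datetime, timedelta, timezone

# 28800e9 nanoseconds = 28800 seconds = 8 hours (the UTC+8 offset).
OFFSET_NS = 28800e9


def to_display_time(starts_at_utc: datetime) -> str:
    """Shift a UTC timestamp by the template's offset and format it."""
    shifted = starts_at_utc + timedelta(seconds=OFFSET_NS / 1e9)
    # Same layout as Go's "2006-01-02 15:04:05".
    return shifted.strftime("%Y-%m-%d %H:%M:%S")


print(to_display_time(datetime(2024, 1, 1, 20, 30, 0, tzinfo=timezone.utc)))
# prints 2024-01-02 04:30:00
```

For a different time zone, change the constant accordingly, for example `3600e9` per hour of offset.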
4. Write alertmanager-secret.yaml
This file replaces the Alertmanager configuration that was created by the original kube-prometheus deployment.
vi alertmanager-secret.yaml
apiVersion: v1
data: { }
kind: Secret
metadata:
  name: alertmanager-main
  namespace: monitoring
stringData:
  alertmanager.yaml: |-
    global:
      resolve_timeout: 5m
    route:
      group_by: ['alertname']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 30m
      receiver: 'webhook'
      routes:
        - match:
            severity: 'info'
          continue: true
          receiver: 'null'
        - match:
            severity: 'none'
          continue: true
          receiver: 'null'
    receivers:
      - name: 'null'
      - name: 'webhook'
        webhook_configs:
          - send_resolved: true
            url: 'http://dingtalk:8060/dingtalk/webhook/send'
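To make the routing tree easier to reason about: alerts with severity `info` or `none` are caught by a child route and go to the empty `null` receiver, while everything else falls back to the top-level `webhook` receiver. The sketch below is a simplified model of that behaviour for this specific config, not Alertmanager's actual implementation. It ignores grouping and timing, and the names `ROUTES` and `receivers_for` are hypothetical.

```python
# Child routes from the config above, in order.
ROUTES = [
    {"match": {"severity": "info"}, "receiver": "null", "continue": True},
    {"match": {"severity": "none"}, "receiver": "null", "continue": True},
]
# Receiver of the top-level route, used when no child route matches.
DEFAULT_RECEIVER = "webhook"


def receivers_for(labels: dict) -> list:
    """Return the receivers that an alert with these labels would reach."""
    matched = []
    for route in ROUTES:
        if all(labels.get(key) == value for key, value in route["match"].items()):
            matched.append(route["receiver"])
            # With continue: true, later sibling routes are still evaluated.
            if not route["continue"]:
                break
    # An alert falls back to the parent receiver only if no child matched.
    return matched or [DEFAULT_RECEIVER]


print(receivers_for({"alertname": "Watchdog", "severity": "none"}))
# prints ['null']
print(receivers_for({"alertname": "KubePodCrashLooping", "severity": "warning"}))
# prints ['webhook']
```

Note that `continue: true` only affects sibling routes, so a matched `info` alert is still not delivered to DingTalk.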
5. Deploy and check that everything is running
kubectl apply -f alertmanager-secret.yaml
kubectl apply -f dingtalk-configmap.yaml
kubectl apply -f dingtalk.yaml
# Check whether the deployment succeeded
kubectl get pods -n monitoring | grep dingtalk
Once dingtalk has been deployed successfully, simply redeploy Alertmanager.
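To check the chain without waiting for a real alert, one option is to post a hand-built payload in the Alertmanager webhook format straight to the dingtalk endpoint. The sketch below is one way to do that. It assumes the Service has been forwarded locally with `kubectl port-forward -n monitoring svc/dingtalk 8060:8060`, and the alert name, labels, and annotations are made up for illustration.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_test_payload(now=None) -> dict:
    """Build a firing test alert shaped like an Alertmanager webhook payload."""
    now = now or datetime.now(timezone.utc)
    alert = {
        "status": "firing",
        # Hypothetical labels and annotations, chosen to fill the template fields.
        "labels": {
            "alertname": "DingtalkSmokeTest",
            "severity": "warning",
            "instance": "test-host",
        },
        "annotations": {
            "summary": "Test alert",
            "description": "Manual test of the DingTalk webhook",
        },
        "startsAt": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "endsAt": "0001-01-01T00:00:00Z",
        "generatorURL": "",
    }
    return {
        "version": "4",
        "groupKey": '{}:{alertname="DingtalkSmokeTest"}',
        "status": "firing",
        "receiver": "webhook",
        "groupLabels": {"alertname": "DingtalkSmokeTest"},
        "commonLabels": alert["labels"],
        "commonAnnotations": alert["annotations"],
        "externalURL": "",
        "alerts": [alert],
    }


def post_payload(url: str, payload: dict, timeout: float = 5.0) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example call, with the port-forward running:
# post_payload("http://127.0.0.1:8060/dingtalk/webhook/send", build_test_payload())
```

If the message shows up in the DingTalk group, the robot settings, the secret, and the template are all working. Whether Alertmanager reaches the Service still has to be checked separately.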