Monitoring Kubernetes with an External Prometheus

Create the RBAC resources

cat prometheus-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
      - nodes/metrics
      - services
      - endpoints
      - pods
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - configmaps
    verbs:
      - get
  - nonResourceURLs:
      - "/metrics"
    verbs:
      - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus
    namespace: kube-system

Get ca.crt and the token

kubectl get sa prometheus -o yaml -n kube-system
kubectl get secrets prometheus-token-tn6ww -n kube-system -o yaml
# Base64-decode the ca.crt and token values from the secret and save them on the Prometheus host
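
The decoding step can be sketched in Python as follows. This is a minimal illustration, not part of the original setup: the function name decode_secret_fields is hypothetical, and it assumes the secret was exported with -o json instead of -o yaml so that it can be parsed with the standard library.

```python
import base64
import json


def decode_secret_fields(secret_json, fields=("ca.crt", "token")):
    """Return the Base64-decoded values of selected keys from a Secret's data map."""
    data = json.loads(secret_json)["data"]
    # Every value under .data in a Secret is Base64-encoded.
    return {name: base64.b64decode(data[name]).decode("utf-8") for name in fields}


def write_fields(decoded, directory):
    """Write each decoded field to <directory>/<field name>."""
    for name, value in decoded.items():
        with open(f"{directory}/{name}", "w") as fh:
            fh.write(value)
```

One possible usage: pipe the output of kubectl get secrets prometheus-token-tn6ww -n kube-system -o json into a script that calls decode_secret_fields on the text it reads from standard input, then call write_fields with the directory where Prometheus expects the two files.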

Configure the Prometheus job

- job_name: 'k8s-test-cadvisor'
  scheme: https
  scrape_interval: 10s
  tls_config:
    ca_file: /opt/prometheus/etc/ca.crt  # CA certificate
    insecure_skip_verify: true
  bearer_token_file: /opt/prometheus/etc/token  # ServiceAccount token
  metrics_path: /metrics/cadvisor
  file_sd_configs:
    - refresh_interval: 30s  # how often the target files are re-read
      files:
        - /opt/prometheus/etc/targets/target_k8s.json
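
To check the token by hand before relying on the job, the request that this job amounts to can be sketched in Python. This is only an approximation of what Prometheus sends. The function names build_cadvisor_request and build_tls_context are hypothetical. With insecure_skip_verify: true the server certificate is not verified, which the sketch mirrors by disabling verification.

```python
import ssl
import urllib.request


def build_cadvisor_request(target, token):
    """Build a GET request for one kubelet target, e.g. '10.200.1.205:10250'."""
    # scheme + target + metrics_path, as configured in the job
    url = f"https://{target}/metrics/cadvisor"
    # bearer_token_file: the file content is sent as a Bearer token
    headers = {"Authorization": f"Bearer {token.strip()}"}
    return urllib.request.Request(url, headers=headers)


def build_tls_context(insecure_skip_verify, ca_file=None):
    """Build a TLS context roughly matching the job's tls_config."""
    ctx = ssl.create_default_context(cafile=ca_file)
    if insecure_skip_verify:
        # check_hostname must be turned off before verify_mode is relaxed
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx
```

Passing the request and the context to urllib.request.urlopen(request, context=ctx) should return the cAdvisor metrics text if the token and the RBAC rules are in order. An HTTP 401 or 403 response would point at the token or the ClusterRole.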

Configure confd

prometheus_discovery_k8s.tmpl
[
{{- range $index, $info := getvs "/prometheus/discovery/k8s/*" -}}
{{- $data := json $info -}}
{{- if ne $index 0 }},{{- end }}
  {
    "targets": [
      "{{$data.address}}"
    ],
    "labels": {
      "node": "{{$data.name}}"
      {{- if $data.labels -}}
      {{- range $data.labels -}}
      ,"{{.key}}": "{{.val}}"
      {{- end}}
      {{- end}}
    }
  }{{- end }}
]

prometheus_discovery_k8s.toml
[template]
src = "prometheus_discovery_k8s.tmpl"
dest = "/opt/prometheus/etc/targets/target_k8s.json"
mode = "0777"
keys = [
"/prometheus/discovery/k8s",
]
reload_cmd = "curl -XPOST 'http://127.0.0.1:9090/-/reload'"

Simulate automatic discovery

etcdctl put /prometheus/discovery/k8s/node01 '{"name":"node01","address":"10.200.1.205:10250","labels":[{"key":"label1","val":"test1"},{"key":"label2","val":"test2"}]}'
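
To see what confd should write to target_k8s.json for this key, the template logic can be mirrored in Python. This is a sketch of the same mapping, not confd itself. The function name render_targets is hypothetical, and the input is assumed to be the list of JSON strings stored under /prometheus/discovery/k8s/.

```python
import json


def render_targets(values):
    """Build the file_sd target groups from the JSON strings stored in etcd."""
    groups = []
    for raw in values:
        data = json.loads(raw)
        # "node" is always set; the optional key/val pairs are appended after it
        labels = {"node": data["name"]}
        for item in data.get("labels") or []:
            labels[item["key"]] = item["val"]
        groups.append({"targets": [data["address"]], "labels": labels})
    return groups


example = (
    '{"name":"node01","address":"10.200.1.205:10250",'
    '"labels":[{"key":"label1","val":"test1"},{"key":"label2","val":"test2"}]}'
)
print(json.dumps(render_targets([example]), indent=2))
```

For the node01 entry this yields a single target group with the target 10.200.1.205:10250 and the labels node=node01, label1=test1 and label2=test2. That is the structure Prometheus file-based service discovery expects.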

Deploy the kube-state-metrics service

See https://github.com/kubernetes/kube-state-metrics/tree/master/examples/standard for the Kubernetes manifests
Kubernetes version v1.20.4, kube-state-metrics version v2.0.0
dockerhub.codoon.com/kube-state-metrics/kube-state-metrics:v2.0.0
Note: replace the image and change the Service type to NodePort
Configure the Prometheus job
- job_name: 'k8s-test-kube-state'
  scrape_interval: 10s
  static_configs:
    - targets:
        - '10.200.1.205:8879'

Notes

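1. Pod-level information and the like is obtained through cAdvisor, which the kubelet integrates by default.
2. Information about other Kubernetes resources has to be obtained through the kube-state-metrics service.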
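
When browsing the scraped series, the two sources can usually be told apart by metric name. The helper below is a rough sketch based on the common naming convention (container_ and machine_ for cAdvisor, kube_ for kube-state-metrics), not an exhaustive rule. The function name metric_source is hypothetical.

```python
def metric_source(name):
    """Guess which exporter a metric comes from, based on its name prefix."""
    if name.startswith("kube_"):
        return "kube-state-metrics"
    if name.startswith(("container_", "machine_")):
        return "cadvisor"
    return "other"
```

For example, container_cpu_usage_seconds_total would be attributed to cAdvisor and kube_pod_status_phase to kube-state-metrics.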