Section 1. Environment and Software Versions
1.1 Operating System Environment
Host IP | Operating System | Deployed Software | Notes |
---|---|---|---|
192.168.10.10 | CentOS 7.9 | Grafana, Pushgateway, Blackbox Exporter | Monitoring UI |
192.168.10.11 | CentOS 7.9 | Loki | Log storage |
192.168.10.12 | CentOS 7.9 | Prometheus | Metrics storage |
192.168.10.13 | CentOS 7.9 | Logstash | Log filtering |
192.168.10.14 | CentOS 7.9 | Filebeat, node_exporter | Log and metrics collection |
192.168.10.15 | Windows Server 2016 | Filebeat, windows_exporter | Log and metrics collection |
1.2 Software Versions
Software | Version | Notes |
---|---|---|
grafana | 8.3.3 | Monitoring UI |
Loki | 2.5.0 | Log storage |
prometheus | 2.32.1 | Metrics storage |
pushgateway | 1.4.2 | Receives custom metrics |
filebeat | 6.4.3 | Log collection client |
node_exporter | 1.3.1 | Metrics collection client (Linux) |
windows_exporter | 0.17.0 | Metrics collection client (Windows) |
logstash | 7.16.2 | Log filtering |
Blackbox Exporter | 0.19.0 | Probes websites, HTTP/TCP/UDP, etc. |
1.3 System Initialization
1. Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
2. Disable SELinux
setenforce 0
vim /etc/selinux/config
SELINUX=disabled
1.4 Architecture Diagram
Section 2. Monitoring Platform Deployment
2.1 Server-Side Deployment
1. Grafana
Note: run on host 192.168.10.10
Install
tar -xvf grafana-8.3.3.linux-amd64.tar
cd grafana-8.3.3/
Start
nohup ./bin/grafana-server > ./log/grafana.log &
Open in a browser: http://192.168.10.10:3000
Default username and password: admin/admin
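Before logging in, you can optionally confirm the service is healthy from the shell (a minimal check; Grafana exposes /api/health by default):
# Returns a small JSON document with "database": "ok" when Grafana is up
curl -s http://192.168.10.10:3000/api/health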
2. Prometheus
Note: run on host 192.168.10.12
Install
tar -xvf prometheus-2.32.1.linux-amd64.tar
cd prometheus-2.32.1.linux-amd64/
Start
nohup ./prometheus --config.file=./prometheus.yml --web.listen-address=:49800 1>nohup.log 2>&1 &
Open in a browser: http://192.168.10.12:49800
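The instance can also be sanity-checked from the command line (a quick sketch using Prometheus' built-in readiness endpoint and HTTP API):
# Readiness probe: returns HTTP 200 once startup has finished
curl -s http://192.168.10.12:49800/-/ready
# List configured scrape targets and their health as JSON
curl -s http://192.168.10.12:49800/api/v1/targets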
3. Integrate Prometheus with Grafana
Add Prometheus as a data source in Grafana; the steps are shown in the figures.
2.2 Client-Side Deployment
1. Linux
- Install
Deploy node_exporter by simply extracting the tarball.
tar -xvf node_exporter-1.3.1.linux-amd64.tar.gz
cd node_exporter-1.3.1.linux-amd64/
- Start
nohup ./node_exporter --web.listen-address=:49999 --log.format=logfmt --collector.textfile.directory=./collection.textfile.directory/ --collector.ntp.server-is-local >/dev/null &
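A quick way to confirm the exporter is serving metrics on the port chosen above (sketch, run from any host that can reach it):
# The first lines of the Prometheus text exposition format should appear
curl -s http://192.168.10.14:49999/metrics | head -n 20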
2. Windows
- Install
On Windows, installation is just extracting the archive.
- Create the startup script startNode.bat
start /b "" .\windows_exporter-0.17.0-amd64.exe --telemetry.addr=":9182" --collector.textfile.directory="./collection.textfile.directory/"
- Start
Double-click the startup script, as shown below.
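As with the Linux client, windows_exporter can be verified by fetching its metrics endpoint, for example from one of the CentOS hosts (a minimal check, assuming port 9182 as set in startNode.bat):
# Should return Prometheus-format metrics if the exporter is running
curl -s http://192.168.10.15:9182/metrics | head -n 20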
3. Configure Prometheus
- Edit the configuration file
vi prometheus.yml
- job_name: "NODE"
static_configs:
- targets: ['192.168.10.14:49999']
labels:
env: prd001
group: PAAS
hostip: 192.168.10.14
- targets: ['192.168.10.15:9182']
labels:
env: prd001
group: PAAS
hostip: 192.168.10.15
- Restart Prometheus
nohup ./prometheus --config.file=./prometheus.yml --web.listen-address=:49800 1>nohup.log 2>&1 &
- Check Prometheus
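Before restarting, the edited file can be validated with promtool, which ships in the Prometheus tarball (a minimal sketch; run it from the Prometheus directory on 192.168.10.12):
# Prints SUCCESS plus a summary of the scrape configs if the YAML is valid
./promtool check config ./prometheus.yml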
4. Configure Grafana and view the dashboards
- Import dashboard templates
Import the Windows and Linux dashboards into Grafana. Windows dashboard ID: 10467, Linux dashboard ID: 11074. The steps are shown in the figures below.
- View the Linux dashboard
- View the Windows dashboard
Section 3. Log Platform Deployment
3.1 Install the Server (Loki)
1. Install
tar -xvf loki.tar.gz
cd loki/
Start
nohup ./loki-linux-amd64 -config.file=config.yaml 1> ./log/loki.log 2> ./log/loki_error.log &
ss -tunlp | grep 3100
tcp LISTEN 0 128 [::]:3100 [::]:* users:(("loki-linux-amd6",pid=8422,fd=10))
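Loki also exposes a readiness endpoint, which is a slightly stronger check than the listening port (sketch, assuming the default HTTP port 3100):
# Returns "ready" once the ingester is up; it may take a short while after startup
curl -s http://127.0.0.1:3100/ready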
2. Configure Grafana
3.2 Deploy Logstash
tar -xvf logstash-7.16.2.tar
cd logstash-7.16.2/
bin/logstash-plugin install file:///bankapp/logstash/plugin/logstash-codec-plain.zip
bin/logstash-plugin install file:///bankapp/logstash/plugin/logstash-output-loki.zip
vi pipelines/log_collect.conf
input{
beats {
port => 10515
}
}
input{
http {
host => "0.0.0.0"
port => 10516
type => "healthcheck"
}
}
filter {
grok{
match => {
"message" => ".*\[INFO\] \[(?<funcname>(.*?)):.*"
}
}
grok {
match => ["message", "%{TIMESTAMP_ISO8601:logdate}"]
}
if [appname] == "switch" {
date {
match => ["logdate", "yyyy-MM-dd HH:mm:ss.SSS"]
target => "@timestamp" # the default target is "@timestamp"
}
}else {
date {
match => ["logdate", "yyyy-MM-dd'T'HH:mm:ss.SSS"]
target => "@timestamp" # the default target is "@timestamp"
}
}
mutate {
remove_field => ["tags"]
remove_field => ["offset"]
remove_field => ["logdate"]
}
}
output {
if [type] == "healthcheck" {
}else{
loki {
url => "http://192.168.10.10:3100/loki/api/v1/push"
batch_size => 112640 #112.64 kilobytes
retries => 5
min_delay => 3
max_delay => 500
message_field => "message"
}
}
}
Start
nohup ./bin/logstash -f ./pipelines/log_collect.conf 1>nohup.log 2>&1 &
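Once Logstash is up, you can check that both inputs from log_collect.conf are listening and poke the HTTP healthcheck input (a sketch; the http input answers with a plain 200 "ok" by default):
# Beats input (10515) and HTTP input (10516) should both show up
ss -tunlp | grep -E '10515|10516'
# Exercise the healthcheck input; events with type "healthcheck" are dropped by the output block above
curl -s http://127.0.0.1:10516/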
3.3 Deploy the Filebeat Client
The log format is as follows:
gtms-switch-center 2022-04-19 17:28:14.616 [http-nio-8080-exec-989] INFO c.p.switchcenter.web.controller.SwitchController
1. Linux
- Install
tar -xvf filebeat.tar.gz
cd filebeat/
- Edit the configuration file
vi filebeat.yml
filebeat.prospectors:
- input_type: log
paths:
- /bankapp/switch/gtms-switch-center/com.pactera.jep.log.biz*.log
multiline:
pattern: '^gtms-switch-center'
negate: true
match: after
max_lines: 200
timeout: 20s
fields:
env: "prd001"
appid: "switch"
appname: "switch"
hostip: "192.168.10.15"
reload.enabled: true
reload.period: 2s
fields_under_root: true
output.logstash:
hosts: ["192.168.10.11:10515" ]
enabled: true
- Start
nohup ./filebeat -e -c filebeat.yml -d "publish" 1>/dev/null 2>&1 &
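Filebeat can validate its configuration and the connection to Logstash before being backgrounded (sketch using the test subcommands available in the 6.x series):
# Validate filebeat.yml
./filebeat test config -c filebeat.yml
# Check connectivity to the Logstash output defined above
./filebeat test output -c filebeat.yml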
2. Windows
- On Windows, just extract the archive, as shown below
- Edit the configuration file filebeat.yml
filebeat.prospectors:
- input_type: log
encoding: gbk
paths:
- C:/bankapp/switch/gtms-switch-center/com.pactera.jep.log.biz*.log
multiline:
pattern: '^gtms-switch-center'
negate: true
match: after
max_lines: 200
timeout: 20s
fields:
env: "prd001"
appid: "switch"
appname: "switch"
hostip: "192.168.10.16"
reload.enabled: true
reload.period: 2s
fields_under_root: true
output.logstash:
hosts: ["192.168.10.11:10515" ]
enabled: true
- Create the background startup script startFilebeat.vbs
set ws=WScript.CreateObject("WScript.Shell")
ws.Run "filebeat.exe -e -c filebeat.yml",0
- Start: double-click the startFilebeat.vbs script
3.4 View Logs in Grafana
Use Grafana to view the logs: you can query the matching log entries using your own filter conditions (keywords, time range, etc.), as shown in the figures below.
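Logs can also be pulled straight from Loki's HTTP API, which is handy for scripting (a sketch against the Loki address used in the Logstash output above; the appname label assumes the Logstash Loki output forwarded that field as a stream label, which may differ in your setup):
# LogQL query over the default time range (last hour), limited to 10 entries
curl -G -s "http://192.168.10.10:3100/loki/api/v1/query_range" \
  --data-urlencode 'query={appname="switch"} |= "INFO"' \
  --data-urlencode 'limit=10'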
Section 4. Custom Monitoring
With custom monitoring, a script you write pushes the metrics you need to Pushgateway; they are then stored in Prometheus and viewed with Grafana.
4.1 Pushgateway
1. Deploy Pushgateway
tar -xvf pushgateway-1.4.2.linux-amd64.tar.gz
cd pushgateway-1.4.2.linux-amd64/
Start
nohup ./pushgateway --web.listen-address=:48888 1>nohup.log 2>&1 &
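To confirm Pushgateway accepts data before wiring it into Prometheus, you can push and then delete a throwaway metric (sketch using the standard push API):
# Push a single gauge under job "test_job"; it should appear at http://192.168.10.10:48888
cat <<EOF | curl --data-binary @- http://192.168.10.10:48888/metrics/job/test_job/instance/test_instance
# TYPE demo_metric gauge
demo_metric 42
EOF
# Remove it again once verified
curl -X DELETE http://192.168.10.10:48888/metrics/job/test_job/instance/test_instance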
2. Integrate Pushgateway with Prometheus
- Edit the configuration file
vi prometheus.yml
- job_name: 'pushgateway'
static_configs:
- targets: ['192.168.10.10:48888']
labels:
instance: pushgateway
- Restart Prometheus
nohup ./prometheus --config.file=./prometheus.yml --web.listen-address=:49800 1>nohup.log 2>&1 &
Note: stop Prometheus, then start it again.
4.2 Monitoring the JVM
1. Write and run the JVM monitoring script
Write the script
vi jvm_stat_exporter.sh
#!/bin/ksh
echo "start ..."
#JAVA_PROCESS_LIST=`jps | grep -v " Jps$" | grep -v " Jstat$"`
#echo $JAVA_PROCESS_LIST
HOST_IP=`ifconfig -a|grep inet|grep -v 127.0.0.1|grep -v 192.168|grep -v inet6|awk '{print $2}'|tr -d "addr:"`
#echo "$HOST_IP"
push_jvm_stat()
{
line=$1
#echo $line
PID=`echo $line | cut -d ' ' -f 1`
PNAME=`echo $line | cut -d ' ' -f 2`
#echo "PID:$PID,HOST_IP:$HOST_IP,PNAME:$PNAME"
GC_LINE=`jstat -gc $PID | tail -1`
#echo "$GC_LINE"
# S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT
# S0C
S0C=`echo $GC_LINE | cut -d ' ' -f 1`
S1C=`echo $GC_LINE | cut -d ' ' -f 2`
S0U=`echo $GC_LINE | cut -d ' ' -f 3`
S1U=`echo $GC_LINE | cut -d ' ' -f 4`
EC=`echo $GC_LINE | cut -d ' ' -f 5`
EU=`echo $GC_LINE | cut -d ' ' -f 6`
OC=`echo $GC_LINE | cut -d ' ' -f 7`
OU=`echo $GC_LINE | cut -d ' ' -f 8`
MC=`echo $GC_LINE | cut -d ' ' -f 9`
MU=`echo $GC_LINE | cut -d ' ' -f 10`
CCSC=`echo $GC_LINE | cut -d ' ' -f 11`
CCSU=`echo $GC_LINE | cut -d ' ' -f 12`
YGC=`echo $GC_LINE | cut -d ' ' -f 13`
YGCT=`echo $GC_LINE | cut -d ' ' -f 14`
FGC=`echo $GC_LINE | cut -d ' ' -f 15`
FGCT=`echo $GC_LINE | cut -d ' ' -f 16`
GCT=`echo $GC_LINE | cut -d ' ' -f 17`
#echo $S0C $S1C $S0U $S1U $EC $EU $OC $OU $MC $MU $CCSC $CCSU $YGC $YGCT $FGC $FGCT $GCT
#echo "******* $HOST_IP $PNAME *******"
cat <<EOF | curl --data-binary @- http://192.168.10.10:48888/metrics/job/test_jvm_job/instance/${HOST_IP}_$PNAME
# TYPE jvm_s0c gauge
jvm_s0c{processname="$PNAME",hostip="$HOST_IP"} $S0C
# TYPE jvm_s1c gauge
jvm_s1c{processname="$PNAME",hostip="$HOST_IP"} $S1C
# TYPE jvm_s0u gauge
jvm_s0u{processname="$PNAME",hostip="$HOST_IP"} $S0U
# TYPE jvm_s1u gauge
jvm_s1u{processname="$PNAME",hostip="$HOST_IP"} $S1U
# TYPE jvm_ec gauge
jvm_ec{processname="$PNAME",hostip="$HOST_IP"} $EC
# TYPE jvm_eu gauge
jvm_eu{processname="$PNAME",hostip="$HOST_IP"} $EU
# TYPE jvm_oc gauge
jvm_oc{processname="$PNAME",hostip="$HOST_IP"} $OC
# TYPE jvm_ou gauge
jvm_ou{processname="$PNAME",hostip="$HOST_IP"} $OU
# TYPE jvm_mc gauge
jvm_mc{processname="$PNAME",hostip="$HOST_IP"} $MC
# TYPE jvm_mu gauge
jvm_mu{processname="$PNAME",hostip="$HOST_IP"} $MU
# TYPE jvm_ccsc gauge
jvm_ccsc{processname="$PNAME",hostip="$HOST_IP"} $CCSC
# TYPE jvm_ccsu gauge
jvm_ccsu{processname="$PNAME",hostip="$HOST_IP"} $CCSU
# TYPE jvm_ygc counter
jvm_ygc{processname="$PNAME",hostip="$HOST_IP"} $YGC
# TYPE jvm_ygct counter
jvm_ygct{processname="$PNAME",hostip="$HOST_IP"} $YGCT
# TYPE jvm_fgc counter
jvm_fgc{processname="$PNAME",hostip="$HOST_IP"} $FGC
# TYPE jvm_fgct counter
jvm_fgct{processname="$PNAME",hostip="$HOST_IP"} $FGCT
# TYPE jvm_gct counter
jvm_gct{processname="$PNAME",hostip="$HOST_IP"} $GCT
EOF
# echo "******* $PNAME 2 *******"
}
while [ 1 = 1 ]
do
jps |grep -v " Jps$" | grep -v " Jstat$" | while read line_jps
do
push_jvm_stat "$line_jps"
done
echo "`date` pushed" > ./lastpushed.log
sleep 5
done
Make the script executable and run it
chmod +x jvm_stat_exporter.sh
./jvm_stat_exporter.sh
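While the script loops, the pushed series should be visible on Pushgateway's own metrics endpoint (quick check):
# One jvm_* series per jstat column should appear for every running JVM
curl -s http://192.168.10.10:48888/metrics | grep '^jvm_' | head -n 20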
2. View the JVM metrics
- View them in Pushgateway, as shown below
- View the metrics in Grafana, as shown below
Section 5. Service Monitoring
5.1 Deploy Blackbox Exporter
1. Install
tar -xvf blackbox_exporter-0.19.0.linux-amd64.tar.gz
cd blackbox_exporter-0.19.0.linux-amd64/
2. Start
nohup ./blackbox_exporter &
3. Access
Open in a browser: http://192.168.10.10:9115
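A probe can be triggered manually to verify the exporter before touching Prometheus (sketch; tcp_connect and http_2xx are modules present in the default blackbox.yml):
# TCP probe against the SSH port of a client host; probe_success 1 in the output means it passed
curl -s 'http://192.168.10.10:9115/probe?module=tcp_connect&target=192.168.10.14:22'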
5.2 Monitoring Ports
1. Configure Prometheus with blackbox_exporter to monitor port 22
- job_name: 'prometheus_port_status'
metrics_path: /probe
params:
module: [tcp_connect]
static_configs:
- targets: ['192.168.10.14:22']
labels:
instance: port_22_ssh
hostip: 192.168.10.14
group: 'tcp'
relabel_configs:
- source_labels: [__address__]
target_label: __param_target
- target_label: __address__
replacement: 192.168.10.10:9115
2. Restart Prometheus
nohup ./prometheus --config.file=./prometheus.yml --web.listen-address=:49800 1>nohup.log 2>&1 &
Note: stop Prometheus, then start it again.
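After the restart, the probe result can be read from the Prometheus HTTP API as well as from Grafana (a minimal sketch, assuming the 49800 listen address used throughout):
# probe_success is 1 when blackbox_exporter could open the TCP connection
curl -s -G 'http://192.168.10.12:49800/api/v1/query' \
  --data-urlencode 'query=probe_success{instance="port_22_ssh"}'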
5.3 Monitoring HTTP
1. Configure Prometheus with blackbox_exporter to monitor HTTP endpoints
- job_name: web_status
metrics_path: /probe
params:
module: [http_2xx]
static_configs:
- targets: ['http://192.168.10.15:8080']
labels:
instance: starweb
hostip: 192.168.10.15
group: 'web'
relabel_configs:
- source_labels: [__address__]
target_label: __param_target
- target_label: __address__
replacement: 192.168.10.10:9115
2. Restart Prometheus
nohup ./prometheus --config.file=./prometheus.yml --web.listen-address=:49800 1>nohup.log 2>&1 &
Note: stop Prometheus, then start it again.
Appendix: Full Configuration Files
prometheus.yml
# my global config
global:
scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
# scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
alertmanagers:
- static_configs:
- targets:
# - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
# - "first_rules.yml"
# - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
# The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
- job_name: "prometheus"
# metrics_path defaults to '/metrics'
# scheme defaults to 'http'.
static_configs:
- targets: ["localhost:8080"]
- job_name: 'pushgateway'
static_configs:
- targets: ['localhost:48888']
labels:
instance: pushgateway
- job_name: "NODE"
static_configs:
# PAAS
- targets: ['10.0.14.206:49999']
labels:
env: prd001
group: PAAS
hostip: 10.0.14.206
- targets: ['10.0.14.205:49999']
labels:
env: prd001
group: APP
hostip: 10.0.14.205
# NGINX
- targets: ['10.0.14.200:49999']
labels:
env: prd001
group: NGINX
hostip: 10.0.14.200
- targets: ['10.0.14.201:49999']
labels:
env: prd001
group: NGINX
hostip: 10.0.14.201
- targets: ['10.0.14.202:49999']
labels:
env: prd001
group: NGINX
hostip: 10.0.14.202
- targets: ['10.0.14.203:49999']
labels:
env: prd001
group: NGINX
hostip: 10.0.14.203
# SWITCH
- targets: ['10.0.14.209:49999']
labels:
env: prd001
group: SWITCH
hostip: 10.0.14.209
- targets: ['10.0.14.210:49999']
labels:
env: prd001
group: SWITCH
hostip: 10.0.14.210
- targets: ['10.0.14.211:49999']
labels:
env: prd001
group: SWITCH
hostip: 10.0.14.211
- targets: ['10.0.14.214:49999']
labels:
env: prd001
group: LOGSTASH
hostip: 10.0.14.214
- targets: ['10.0.14.215:49999']
labels:
env: prd001
group: LOGSTASH
hostip: 10.0.14.215
- targets: ['10.0.14.221:49999']
labels:
env: prd001
group: SWITCH
hostip: 10.0.14.221
- targets: ['10.0.14.216:49999']
labels:
env: prd001
group: TOOLS
hostip: 10.0.14.216
- job_name: 'prometheus_port_status'
metrics_path: /probe
params:
module: [tcp_connect]
static_configs:
- targets: ['10.0.14.222:6789']
labels:
instance: port_6789_PINGAN
hostip: 10.0.14.222
group: 'tcp'
relabel_configs:
- source_labels: [__address__]
target_label: __param_target
- target_label: __address__
replacement: 127.0.0.1:9115
- job_name: web_status
metrics_path: /probe
params:
module: [http_2xx]
static_configs:
- targets: ['https://star.moutai.com.cn/nginx_status']
labels:
instance: starweb
hostip: 10.0.14.241
group: 'web'
- targets: ['http://10.0.14.226:8046/actuator/health']
labels:
instance: gtms-service-business
hostip: 10.0.14.226
group: 'web'
- targets: ['http://10.0.14.225:8043/actuator/health']
labels:
instance: gtms-service-gateway
hostip: 10.0.14.225
group: 'web'
- targets: ['http://10.0.14.226:8043/actuator/health']
labels:
instance: gtms-service-gateway
hostip: 10.0.14.226
group: 'web'
- targets: ['http://10.0.14.204:8047/actuator/health']
labels:
instance: gtms-service-job
hostip: 10.0.14.204
group: 'web'
- targets: ['http://10.0.14.205:8047/actuator/health']
labels:
instance: gtms-service-job
hostip: 10.0.14.205
group: 'web'
- targets: ['http://10.0.14.225:9080/star-api/actuator/health']
labels:
instance: ijep-router-zuul-star-api
hostip: 10.0.14.225
group: 'web'
- targets: ['http://10.0.14.226:9080/star-api/actuator/health']
labels:
instance: ijep-router-zuul-star-api
hostip: 10.0.14.226
group: 'web'
relabel_configs:
- source_labels: [__address__]
target_label: __param_target
- target_label: __address__
replacement: 127.0.0.1:9115
loki-config.yaml
auth_enabled: false
server:
http_listen_port: 3100
ingester:
lifecycler:
address: 127.0.0.1
ring:
kvstore:
store: inmemory
replication_factor: 1
final_sleep: 0s
chunk_idle_period: 5m
chunk_retain_period: 30s
schema_config:
configs:
- from: 2020-05-15
store: boltdb-shipper
object_store: filesystem
schema: v11
index:
prefix: index_
period: 24h
storage_config:
boltdb_shipper:
active_index_directory: /bankapp/loki/data/index
cache_location: /bankapp/loki/data/index/cache
shared_store: filesystem
filesystem:
directory: /bankapp/loki/data/chunks
limits_config:
enforce_metric_name: false
reject_old_samples: true
reject_old_samples_max_age: 336h
per_stream_rate_limit: "30MB"
ingestion_rate_mb: 50
retention_period: 336h
compactor:
working_directory: /bankapp/loki/data/compactor
shared_store: filesystem
compaction_interval: 10m
retention_enabled: true
FAQ
1. Loki is not receiving logs, or Prometheus is not collecting metrics
Solution:
Check whether the firewall rules allow the required ports, or disable the firewall entirely (not recommended in production). A few diagnostic commands follow below.
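A sketch of the checks; substitute the server IP and port for the service in question (e.g. 3100 for Loki, 49800 for Prometheus):
# Is firewalld really stopped on the server?
systemctl status firewalld
# Is the service listening on the expected port on the server side?
ss -tunlp | grep 3100
# From the client host: can the port be reached at all? (bash built-in TCP test)
timeout 3 bash -c 'cat < /dev/null > /dev/tcp/192.168.10.11/3100' && echo "port reachable" || echo "port blocked or service down"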