Prometheus (2): Installation and Configuration

Installing the Prometheus Server

Download

The operating system used for this installation is Ubuntu 18.04.

First, get the Prometheus release package, either from the Prometheus download page or directly with wget.

$ wget https://github.com/prometheus/prometheus/releases/download/v2.43.1/prometheus-2.43.1.linux-amd64.tar.gz

Extract the archive in the /home/work/app directory.

$ tar xvfz prometheus-2.43.1.linux-amd64.tar.gz

All of the installation files are now in /home/work/app/prometheus-2.43.1.linux-amd64.
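
The download and extraction commands have to name the same release, and a single version variable is one way to keep them in sync. The following is a minimal sketch: the helper names prom_tarball and prom_url are made up for illustration, and the commands are only printed so they can be reviewed before running.

```shell
# Sketch: derive the tarball name and download URL from one version string.
# PROM_VERSION is the release used for extraction in this article.
PROM_VERSION="2.43.1"

# prom_tarball and prom_url are hypothetical helper names.
prom_tarball() {
  printf 'prometheus-%s.linux-amd64.tar.gz\n' "$1"
}

prom_url() {
  printf 'https://github.com/prometheus/prometheus/releases/download/v%s/%s\n' \
    "$1" "$(prom_tarball "$1")"
}

# Print the commands instead of running them.
echo "wget $(prom_url "$PROM_VERSION")"
echo "tar xvfz $(prom_tarball "$PROM_VERSION")"
```

Changing PROM_VERSION in one place then updates both the URL and the archive name.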

$ ls
console_libraries consoles LICENSE NOTICE prometheus prometheus.yml promtool

Prometheus can now be started directly from the command line. By default, its database is stored in the ./data directory.

$ ./prometheus --config.file=prometheus.yml

Configuring systemd

A manually started process terminates as soon as you leave the console, so Prometheus should be managed by systemd instead.

Create the file /etc/systemd/system/prometheus.service with the following contents.

[Unit]
Description=Prometheus
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=root
ExecStart=/home/work/app/prometheus-2.43.1.linux-amd64/prometheus --config.file=/home/work/app/prometheus-2.43.1.linux-amd64/prometheus.yml
Restart=on-failure
[Install]
WantedBy=multi-user.target
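
One thing to watch with this unit: the default data directory is relative (data/), and as far as I know a systemd system service starts with / as its working directory unless WorkingDirectory= is set, so the database would likely end up in /data. The sketch below prints the same unit with an explicit --storage.tsdb.path flag. The function name render_prometheus_unit and the data directory /home/work/app/prometheus-data are assumptions for illustration, not part of the original setup.

```shell
# Sketch: print a prometheus.service unit with an explicit data directory.
# render_prometheus_unit is a hypothetical helper name.
#   $1: directory containing the prometheus binary and prometheus.yml
#   $2: directory for the TSDB data (an assumed location, pick your own)
render_prometheus_unit() {
  prom_dir="$1"
  data_dir="$2"
  cat <<EOF
[Unit]
Description=Prometheus
Documentation=https://prometheus.io/
After=network.target

[Service]
Type=simple
User=root
ExecStart=${prom_dir}/prometheus --config.file=${prom_dir}/prometheus.yml --storage.tsdb.path=${data_dir}
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Review the output before copying it to /etc/systemd/system/prometheus.service.
render_prometheus_unit /home/work/app/prometheus-2.43.1.linux-amd64 /home/work/app/prometheus-data
```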

Reload the systemd daemon configuration.

$ sudo systemctl daemon-reload

Enable the Prometheus service so that it starts at boot.

$ sudo systemctl enable prometheus.service

Start the Prometheus service.

$ sudo systemctl start prometheus.service

Check the status of the Prometheus service.

$ sudo systemctl status prometheus.service
● prometheus.service - Prometheus
Loaded: loaded (/etc/systemd/system/prometheus.service; enabled; vendor preset: enabled)
Active: active (running) since Mon 2023-06-12 23:07:50 CST; 1 weeks 1 days ago
Docs: https://prometheus.io/
Main PID: 27412 (prometheus)
Tasks: 11 (limit: 4915)
CGroup: /system.slice/prometheus.service
└─27412 /home/work/app/prometheus-2.43.1.linux-amd64/prometheus --config.file=/home/work/app/prometheus-2.43.1.linux-amd64/prometheus.yml

Jun 21 13:00:01 VM-0-5-ubuntu prometheus[27412]: ts=2023-06-21T05:00:01.857Z caller=compact.go:519 level=info component=tsdb msg="write block" mint=1687312800703 maxt=1687320000000 ulid=01H3E55TS1CH72Y1BPS662KEND duration=159.6571ms
Jun 21 13:00:01 VM-0-5-ubuntu prometheus[27412]: ts=2023-06-21T05:00:01.861Z caller=head.go:1269 level=info component=tsdb msg="Head GC completed" caller=truncateMemory duration=3.178198ms
Jun 21 13:00:01 VM-0-5-ubuntu prometheus[27412]: ts=2023-06-21T05:00:01.862Z caller=checkpoint.go:100 level=info component=tsdb msg="Creating checkpoint" from_segment=455 to_segment=456 mint=1687320000000
Jun 21 13:00:02 VM-0-5-ubuntu prometheus[27412]: ts=2023-06-21T05:00:02.032Z caller=head.go:1241 level=info component=tsdb msg="WAL checkpoint complete" first=455 last=456 duration=170.109287ms
Jun 21 13:00:02 VM-0-5-ubuntu prometheus[27412]: ts=2023-06-21T05:00:02.311Z caller=compact.go:460 level=info component=tsdb msg="compact blocks" count=3 mint=1687284000703 maxt=1687305600000 ulid=01H3E55V3HEDWFD78GZ58N65SZ sources="[01H3D9PXS041
Jun 21 13:00:02 VM-0-5-ubuntu prometheus[27412]: ts=2023-06-21T05:00:02.316Z caller=db.go:1548 level=info component=tsdb msg="Deleting obsolete block" block=01H3DGJN10G8HEKJ1FHMSEADPY
Jun 21 13:00:02 VM-0-5-ubuntu prometheus[27412]: ts=2023-06-21T05:00:02.321Z caller=db.go:1548 level=info component=tsdb msg="Deleting obsolete block" block=01H3DQEC90H8J6WDCKKS7QP4P0
Jun 21 13:00:02 VM-0-5-ubuntu prometheus[27412]: ts=2023-06-21T05:00:02.330Z caller=db.go:1548 level=info component=tsdb msg="Deleting obsolete block" block=01H3D9PXS0410D5KPG37H4YQ20
Jun 21 15:00:01 VM-0-5-ubuntu prometheus[27412]: ts=2023-06-21T07:00:01.953Z caller=compact.go:519 level=info component=tsdb msg="write block" mint=1687320000703 maxt=1687327200000 ulid=01H3EC1J11F1VG6Q281TYVBXPM duration=256.052451ms
Jun 21 15:00:01 VM-0-5-ubuntu prometheus[27412]: ts=2023-06-21T07:00:01.958Z caller=head.go:1269 level=info component=tsdb msg="Head GC completed" caller=truncateMemory duration=3.306586ms

The output shows that the service is running normally.

Next, check Prometheus's default port, 9090.

$ netstat -nlpt | grep 9090
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp6 0 0 :::9090 :::* LISTEN -

The port is in the LISTEN state, so the service is up.
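
A listening port only shows that the process has bound the socket. Prometheus also exposes a /-/healthy endpoint, so an HTTP check is a slightly stronger test. This is a minimal sketch that assumes curl is installed; prom_healthy is a hypothetical helper name.

```shell
# Sketch: HTTP health check against a Prometheus server.
# prom_healthy is a hypothetical helper; $1 is the base URL (defaults to localhost:9090).
prom_healthy() {
  # -f: treat HTTP errors as failure, -s: no progress output,
  # --max-time: give up after 5 seconds
  curl -fs --max-time 5 "${1:-http://localhost:9090}/-/healthy" >/dev/null
}

if prom_healthy; then
  echo "Prometheus is healthy"
else
  echo "Prometheus did not answer on /-/healthy"
fi
```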

Installing Node_Exporter

The whole process is very similar to installing Prometheus.

First, download the node_exporter package.

$ wget https://github.com/prometheus/node_exporter/releases/download/v1.5.0/node_exporter-1.5.0.linux-amd64.tar.gz

Extract the archive.

$ tar xvfz node_exporter-1.5.0.linux-amd64.tar.gz

Let systemd manage the service by adding the file /etc/systemd/system/node_exporter.service:

[Unit]
Description=node_exporter
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=root
ExecStart=/home/work/apps/prometheus/node_exporter-1.5.0.linux-amd64/node_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target

Reload the daemon, start the exporter, and check its status.

$ sudo systemctl daemon-reload 
$ sudo systemctl start node_exporter.service
$ sudo systemctl enable node_exporter.service
$ sudo systemctl status node_exporter.service
● node_exporter.service - node_exporter
Loaded: loaded (/etc/systemd/system/node_exporter.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2023-05-14 13:41:38 CST; 1 months 7 days ago
Docs: https://prometheus.io/
Main PID: 15620 (node_exporter)
Tasks: 5 (limit: 4915)
CGroup: /system.slice/node_exporter.service
└─15620 /home/work/apps/prometheus/node_exporter-1.5.0.linux-amd64/node_exporter

May 14 13:41:38 VM-0-5-ubuntu node_exporter[15620]: ts=2023-05-14T05:41:38.818Z caller=node_exporter.go:117 level=info collector=thermal_zone
May 14 13:41:38 VM-0-5-ubuntu node_exporter[15620]: ts=2023-05-14T05:41:38.818Z caller=node_exporter.go:117 level=info collector=time
May 14 13:41:38 VM-0-5-ubuntu node_exporter[15620]: ts=2023-05-14T05:41:38.818Z caller=node_exporter.go:117 level=info collector=timex
May 14 13:41:38 VM-0-5-ubuntu node_exporter[15620]: ts=2023-05-14T05:41:38.819Z caller=node_exporter.go:117 level=info collector=udp_queues
May 14 13:41:38 VM-0-5-ubuntu node_exporter[15620]: ts=2023-05-14T05:41:38.819Z caller=node_exporter.go:117 level=info collector=uname
May 14 13:41:38 VM-0-5-ubuntu node_exporter[15620]: ts=2023-05-14T05:41:38.819Z caller=node_exporter.go:117 level=info collector=vmstat
May 14 13:41:38 VM-0-5-ubuntu node_exporter[15620]: ts=2023-05-14T05:41:38.819Z caller=node_exporter.go:117 level=info collector=xfs
May 14 13:41:38 VM-0-5-ubuntu node_exporter[15620]: ts=2023-05-14T05:41:38.819Z caller=node_exporter.go:117 level=info collector=zfs
May 14 13:41:38 VM-0-5-ubuntu node_exporter[15620]: ts=2023-05-14T05:41:38.819Z caller=tls_config.go:232 level=info msg="Listening on" address=[::]:9100
May 14 13:41:38 VM-0-5-ubuntu node_exporter[15620]: ts=2023-05-14T05:41:38.819Z caller=tls_config.go:235 level=info msg="TLS is disabled." http2=false address=[::]:9100
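
The log above shows node_exporter listening on port 9100. A quick way to confirm that it actually serves metrics is to fetch /metrics and count the lines that start with node_. The sketch below keeps the counting step separate so it can be tried on any input; count_node_metrics is a hypothetical helper name, and the curl call is left as a comment because it assumes the exporter is running locally.

```shell
# Sketch: count the node_* sample lines in Prometheus text-format output read from stdin.
# count_node_metrics is a hypothetical helper name.
count_node_metrics() {
  grep -c '^node_'
}

# Usage against a locally running exporter (assumes curl is installed):
#   curl -s http://localhost:9100/metrics | count_node_metrics
```

A count of 0 suggests the exporter is not reachable or is not exposing its collectors.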

Configuring the Prometheus Server

All Prometheus configuration lives in the prometheus.yml file, which is in the extracted directory.

scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "web-server"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    scrape_interval: 5s
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ["localhost:8084", "localhost:8085"]

Each job_name entry defines one scrape job, which roughly corresponds to one service. The main settings are:

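  • job_name: the name of the scrape job
  • scrape_interval: the time between scrapes; here every 5s (the default is 1 minute)
  • metrics_path: the HTTP path to scrape; here requests go to localhost:8084/actuator/prometheus and localhost:8085/actuator/prometheus. If unset, it defaults to /metrics
  • static_configs:
    • targets: the target hosts as host:port pairs; more than one can be listed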
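
The same settings apply to the node_exporter installed earlier, which its log shows listening on port 9100 and which serves the default /metrics path. A scrape job for it might look like the fragment printed below. This is a sketch: the job name "node" and the helper name node_job_yaml are illustrative choices, and the fragment is meant to be pasted under scrape_configs by hand.

```shell
# Sketch: print a scrape job entry for a local node_exporter.
# node_job_yaml is a hypothetical helper; "node" is an arbitrary job name.
node_job_yaml() {
  cat <<'EOF'
  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]
EOF
}

# Print the fragment for review.
node_job_yaml
```

Because metrics_path and scrape_interval are omitted, the job falls back to /metrics and the global interval.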

After every configuration change, validate the file before restarting Prometheus.

$ ./promtool check config prometheus.yml
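
The check and the restart can be chained so that a broken file never reaches the running service. A minimal sketch, assuming the systemd unit is named prometheus.service as above; safe_restart is a hypothetical helper name.

```shell
# Sketch: restart Prometheus only if promtool accepts the config.
# safe_restart is a hypothetical helper.
#   $1: path to promtool
#   $2: path to the config file
safe_restart() {
  if "$1" check config "$2"; then
    sudo systemctl restart prometheus.service
  else
    echo "config check failed, not restarting" >&2
    return 1
  fi
}

# Usage, from the extracted directory:
#   safe_restart ./promtool prometheus.yml
```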

Finally, restart the Prometheus service.

$ sudo systemctl restart prometheus.service