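Welcome to my GitHub
All of Xinchen's original articles (with accompanying source code) are categorized and collected here: https://github.com/zq2599/blog_demos
Overview
- This article walks through compiling and running Prometheus 2.54 on macOS (M2 chip), then installing node_exporter and Grafana to display the Prometheus metrics
Local environment
- Operating system: macOS Sonoma (14.6.1)
- Go version: 1.23.0
- Prometheus: 2.54
Preparation
- First install npm, which the official build command depends on; I installed it with brew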
shell
brew install npm
- Run npm version to verify that the installation succeeded
shell
npm version
{
npm: '10.8.3',
node: '22.9.0',
acorn: '8.12.1',
ada: '2.9.0',
amaro: '0.1.8',
ares: '1.33.1',
brotli: '1.1.0',
cjs_module_lexer: '1.4.1',
cldr: '44.1',
icu: '74.2',
llhttp: '9.2.1',
modules: '127',
napi: '9',
nbytes: '0.1.1',
ncrypto: '0.0.1',
nghttp2: '1.63.0',
openssl: '3.3.2',
simdjson: '3.10.0',
simdutf: '5.5.0',
sqlite: '3.46.1',
tz: '2023c',
undici: '6.19.8',
unicode: '15.1',
uv: '1.49.0',
uvwasi: '0.0.21',
v8: '12.4.254.21-node.19',
zlib: '1.2.12'
}
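- When building open-source projects, the local Go version often differs from the one the project expects, so it helps to have a tool that manages several Go versions locally and lets you switch to whichever one is needed. I use gvm, installed as follows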
shell
bash < <(curl -s -S -L https://raw.githubusercontent.com/moovweb/gvm/master/binscripts/gvm-installer)
- Once gvm is installed, run gvm listall to see every Go version available for installation
shell
gvm listall
gvm gos (available)
go1
go1.0.1
go1.0.2
go1.0.3
go1.1
go1.1rc2
go1.1rc3
...
...
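- According to the version information in the Prometheus project's go.mod, Go 1.23.0 is needed, so install it with gvm. Note that gvm supports two installation methods: downloading the source and compiling it locally, or downloading a prebuilt binary. For the first method, run the following command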
shell
gvm install go1.23.0
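- Note that building from source requires an existing Go installation; for example, compiling 1.23 requires Go 1.20 to be present locally. If that is too much trouble, use the second method and download the prebuilt binary directly. The command is the same except for the extra -B flag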
shell
gvm install go1.23.0 -B
- After installation, run gvm use go1.23.0 to switch to it
shell
gvm use go1.23.0
Now using version go1.23.0
- Check again to confirm the version has switched
shell
go version
go version go1.23.0 darwin/arm64
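The version check above can also be scripted. The following is a minimal sketch, assuming it is run from a directory containing a go.mod file; the helper names required_go_version, local_go_version and version_ge are my own and not part of Go or gvm.

```shell
#!/usr/bin/env bash
# Print the version named by the `go` directive of a go.mod file.
# The default path ./go.mod is an assumption; pass another path as $1.
required_go_version() {
  awk '$1 == "go" { print $2; exit }' "${1:-go.mod}"
}

# Print the locally active Go version without the "go" prefix, e.g. 1.23.0.
local_go_version() {
  go version | awk '{ sub(/^go/, "", $3); print $3 }'
}

# Succeed when dotted version $1 is greater than or equal to dotted version $2.
# Only the numeric part of each component is compared.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(a, x, "."); nb = split(b, y, ".")
    n = (na > nb) ? na : nb
    for (i = 1; i <= n; i++) {
      if (x[i] + 0 > y[i] + 0) exit 0
      if (x[i] + 0 < y[i] + 0) exit 1
    }
    exit 0
  }'
}
```

Inside the source directory, something like `version_ge "$(local_go_version)" "$(required_go_version)" && echo "local Go is new enough"` would then report whether the active Go version satisfies go.mod.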
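Download the Prometheus source
- Pick a suitable version on the Prometheus releases page and download its source code
- I chose 2.54.1
Build
- Unpack the downloaded source and enter the extracted directory prometheus-2.54.1
- Run make build to compile the Prometheus source. As long as the console reports no errors, the build succeeded; the tail of the output is mostly dependency download messages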
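If you prefer the command line to the releases page, the download can be sketched as below. This assumes GitHub's standard source-archive URL for a tag; the function names are hypothetical helpers of mine, and the version is passed in as an argument.

```shell
#!/usr/bin/env bash
# Build the GitHub source-archive URL for a Prometheus release tag.
prometheus_src_url() {
  local version="$1"   # e.g. 2.54.1
  printf 'https://github.com/prometheus/prometheus/archive/refs/tags/v%s.tar.gz\n' "$version"
}

# Download the archive and unpack it into the current directory.
fetch_prometheus_src() {
  local version="$1"
  curl -fsSL -o "prometheus-${version}.tar.gz" "$(prometheus_src_url "$version")" &&
    tar -xzf "prometheus-${version}.tar.gz"
}
```

Calling `fetch_prometheus_src 2.54.1` should leave a prometheus-2.54.1 directory next to the downloaded archive.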
shell
go: downloading github.com/mattn/go-colorable v0.1.13
go: downloading github.com/hashicorp/go-immutable-radix v1.3.1
go: downloading github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da
go: downloading github.com/emicklei/go-restful/v3 v3.11.0
go: downloading github.com/hashicorp/golang-lru v0.6.0
> promtool
go: downloading github.com/google/pprof v0.0.0-20240711041743-f6c9dda6c6da
go: downloading github.com/nsf/jsondiff v0.0.0-20230430225905-43f6cf3098c1
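- List the current directory; the modification times show two newly generated files (prometheus and promtool), which are the build outputs
- The build is now complete, so the next step is to verify that the freshly built Prometheus works
Deploy node_exporter
- To verify Prometheus, have it scrape the machine metrics of this computer and then view them in the Prometheus web page
- First download node_exporter, choosing the build that matches your machine on the releases page. Mine is macOS on an M2 chip, so I chose the darwin and arm combination
- Unpack the download, enter the node_exporter directory, and run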
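Instead of comparing modification times by eye, the two build outputs can be checked with a small script. This is only a sketch; check_build_outputs is a hypothetical helper name, and it tests nothing beyond the files existing and being executable.

```shell
#!/usr/bin/env bash
# Report whether the two build outputs exist and are executable in a directory.
# $1: directory to check (defaults to the current directory).
check_build_outputs() {
  local dir="${1:-.}" name status=0
  for name in prometheus promtool; do
    if [ -x "$dir/$name" ]; then
      echo "ok: $dir/$name"
    else
      echo "missing or not executable: $dir/$name" >&2
      status=1
    fi
  done
  return "$status"
}
```

From the source directory, `check_build_outputs . && ./prometheus --version && ./promtool --version` would additionally print the version each binary was built with.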
shell
./node_exporter
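- On macOS a downloaded binary like this is blocked from running, so you need to lift the restriction manually
- After startup the console prints the following, showing a number of collectors for CPU, disk, memory and so on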
shell
./node_exporter
ts=2024-10-04T11:56:36.515Z caller=node_exporter.go:193 level=info msg="Starting node_exporter" version="(version=1.8.2, branch=HEAD, revision=f1e0e8360aa60b6cb5e5cc1560bed348fc2c1895)"
ts=2024-10-04T11:56:36.515Z caller=node_exporter.go:194 level=info msg="Build context" build_context="(go=go1.22.5, platform=darwin/arm64, user=root@dc3a6de96cb1, date=20240714-11:56:30, tags=unknown)"
ts=2024-10-04T11:56:36.516Z caller=filesystem_common.go:111 level=info collector=filesystem msg="Parsed flag --collector.filesystem.mount-points-exclude" flag=^/(dev)($|/)
ts=2024-10-04T11:56:36.516Z caller=filesystem_common.go:113 level=info collector=filesystem msg="Parsed flag --collector.filesystem.fs-types-exclude" flag=^devfs$
ts=2024-10-04T11:56:36.516Z caller=node_exporter.go:111 level=info msg="Enabled collectors"
ts=2024-10-04T11:56:36.516Z caller=node_exporter.go:118 level=info collector=boottime
ts=2024-10-04T11:56:36.516Z caller=node_exporter.go:118 level=info collector=cpu
ts=2024-10-04T11:56:36.516Z caller=node_exporter.go:118 level=info collector=diskstats
ts=2024-10-04T11:56:36.516Z caller=node_exporter.go:118 level=info collector=filesystem
ts=2024-10-04T11:56:36.516Z caller=node_exporter.go:118 level=info collector=loadavg
ts=2024-10-04T11:56:36.516Z caller=node_exporter.go:118 level=info collector=meminfo
ts=2024-10-04T11:56:36.516Z caller=node_exporter.go:118 level=info collector=netdev
ts=2024-10-04T11:56:36.516Z caller=node_exporter.go:118 level=info collector=os
ts=2024-10-04T11:56:36.516Z caller=node_exporter.go:118 level=info collector=powersupplyclass
ts=2024-10-04T11:56:36.516Z caller=node_exporter.go:118 level=info collector=textfile
ts=2024-10-04T11:56:36.516Z caller=node_exporter.go:118 level=info collector=thermal
ts=2024-10-04T11:56:36.516Z caller=node_exporter.go:118 level=info collector=time
ts=2024-10-04T11:56:36.516Z caller=node_exporter.go:118 level=info collector=uname
ts=2024-10-04T11:56:36.517Z caller=tls_config.go:313 level=info msg="Listening on" address=[::]:9100
ts=2024-10-04T11:56:36.517Z caller=tls_config.go:316 level=info msg="TLS is disabled." http2=false address=[::]:9100
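Before involving Prometheus, it can be useful to confirm that node_exporter really serves metrics on port 9100. The sketch below filters the text exposition format for one metric; metric_samples is a hypothetical helper name, and the /metrics path is node_exporter's default.

```shell
#!/usr/bin/env bash
# Read Prometheus text-format metrics on stdin and print the sample lines
# (not the "# HELP" / "# TYPE" comment lines) for exactly one metric name.
metric_samples() {
  local name="$1"
  awk -v name="$name" '
    /^#/ { next }
    {
      metric = $0
      sub(/[{ ].*/, "", metric)   # strip labels and value, keep the bare name
      if (metric == name) print
    }'
}
```

With node_exporter running, `curl -s http://localhost:9100/metrics | metric_samples node_load1` should print the current one-minute load sample.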
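Prometheus configuration
- node_exporter is now running and waiting for Prometheus to scrape it. Before starting Prometheus, prepare the matching configuration file
- In the directory that holds the newly built prometheus binary, create config.yml with the following content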
yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing two endpoints to scrape:
# Prometheus itself and the local node_exporter.
scrape_configs:
  # The first job scrapes Prometheus's own metrics
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
  # The second job scrapes the node_exporter metrics
  - job_name: "my_computer"
    static_configs:
      - targets: ["localhost:9100"]
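Since YAML is sensitive to indentation, it may be worth validating the file before starting the server. The build also produced promtool, which has a check config subcommand; the wrapper below is a sketch, and both the function name and the PROMTOOL override variable are hypothetical.

```shell
#!/usr/bin/env bash
# Validate a Prometheus config file with promtool before starting the server.
# $1: config file (defaults to config.yml).
# PROMTOOL is a hypothetical override variable; it defaults to ./promtool.
check_prometheus_config() {
  local config="${1:-config.yml}"
  local promtool="${PROMTOOL:-./promtool}"
  if [ ! -x "$promtool" ]; then
    echo "promtool not found at $promtool" >&2
    return 2
  fi
  "$promtool" check config "$config"
}
```

Running `check_prometheus_config config.yml` in the build directory should report either success or the first syntax problem promtool finds.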
Start Prometheus
- Run the start command, pointing it at the configuration file just created
shell
./prometheus --config.file=config.yml
- The console prints the following during startup
shell
~/temp/202410/04/prometheus-2.54.1 ./prometheus --config.file=config.yml
ts=2024-10-04T12:21:47.809Z caller=main.go:601 level=info msg="No time or size retention was set so using the default time retention" duration=15d
ts=2024-10-04T12:21:47.809Z caller=main.go:645 level=info msg="Starting Prometheus Server" mode=server version="(version=2.54.1, branch=non-git, revision=non-git)"
ts=2024-10-04T12:21:47.809Z caller=main.go:650 level=info build_context="(go=go1.23.0, platform=darwin/arm64, user=will@willdeAir, date=20241004-10:01:51, tags=netgo,builtinassets,stringlabels)"
ts=2024-10-04T12:21:47.809Z caller=main.go:651 level=info host_details=(darwin)
ts=2024-10-04T12:21:47.809Z caller=main.go:652 level=info fd_limits="(soft=10240, hard=unlimited)"
ts=2024-10-04T12:21:47.809Z caller=main.go:653 level=info vm_limits="(soft=unlimited, hard=unlimited)"
ts=2024-10-04T12:21:47.811Z caller=web.go:571 level=info component=web msg="Start listening for connections" address=0.0.0.0:9090
ts=2024-10-04T12:21:47.812Z caller=main.go:1160 level=info msg="Starting TSDB ..."
ts=2024-10-04T12:21:47.813Z caller=tls_config.go:313 level=info component=web msg="Listening on" address=[::]:9090
ts=2024-10-04T12:21:47.813Z caller=tls_config.go:316 level=info component=web msg="TLS is disabled." http2=false address=[::]:9090
ts=2024-10-04T12:21:47.814Z caller=head.go:626 level=info component=tsdb msg="Replaying on-disk memory mappable chunks if any"
ts=2024-10-04T12:21:47.814Z caller=head.go:713 level=info component=tsdb msg="On-disk memory mappable chunks replay completed" duration=24.958µs
ts=2024-10-04T12:21:47.814Z caller=head.go:721 level=info component=tsdb msg="Replaying WAL, this may take a while"
ts=2024-10-04T12:21:47.815Z caller=head.go:793 level=info component=tsdb msg="WAL segment loaded" segment=0 maxSegment=0
ts=2024-10-04T12:21:47.815Z caller=head.go:830 level=info component=tsdb msg="WAL replay completed" checkpoint_replay_duration=42.209µs wal_replay_duration=547.125µs wbl_replay_duration=42ns chunk_snapshot_load_duration=0s mmap_chunk_replay_duration=24.958µs total_replay_duration=628.75µs
ts=2024-10-04T12:21:47.816Z caller=main.go:1181 level=info fs_type=1a
ts=2024-10-04T12:21:47.816Z caller=main.go:1184 level=info msg="TSDB started"
ts=2024-10-04T12:21:47.816Z caller=main.go:1367 level=info msg="Loading configuration file" filename=config.yml
ts=2024-10-04T12:21:47.830Z caller=main.go:1404 level=info msg="updated GOGC" old=100 new=75
ts=2024-10-04T12:21:47.830Z caller=main.go:1415 level=info msg="Completed loading of configuration file" filename=config.yml totalDuration=13.984667ms db_storage=459ns remote_storage=750ns web_handler=292ns query_engine=375ns scrape=13.677291ms scrape_sd=44.25µs notify=28µs notify_sd=3.417µs rules=1.458µs tracing=12.166µs
ts=2024-10-04T12:21:47.830Z caller=main.go:1145 level=info msg="Server is ready to receive web requests."
ts=2024-10-04T12:21:47.830Z caller=manager.go:164 level=info component="rule manager" msg="Starting rule manager..."
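Verify Prometheus
- Open http://localhost:9090/ in a browser
- Pick any metric to inspect; I used node_load1, which is the system load average over the last minute
Deploy Grafana
- The Prometheus verification is actually complete at this point, but if you want to learn more about Prometheus metrics and PromQL, you can deploy Grafana and study the details of some classic dashboards
- First download Grafana; the download page lists the available packages, so choose the one that fits your machine
- Unpack the download to get a new folder named grafana-v11.1.7, enter it, and run the following command to start Grafana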
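The same check works without a browser through the Prometheus HTTP API, whose /api/v1/query endpoint evaluates an instant PromQL query. The following is a minimal sketch; prom_query is a helper name of mine and PROM_URL is a hypothetical override variable for the server address.

```shell
#!/usr/bin/env bash
# Run an instant PromQL query against the Prometheus HTTP API and print the JSON reply.
# $1: PromQL expression.
# PROM_URL is a hypothetical override variable; it defaults to the local server.
prom_query() {
  local expr="$1"
  curl -fsS -G "${PROM_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=${expr}"
}
```

For example, `prom_query node_load1` returns the current load sample as JSON, and `prom_query 'up{job="my_computer"}'` shows whether the node_exporter target is being scraped successfully.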
shell
./bin/grafana server
- Console output after a successful start
shell
INFO [10-04|21:24:36] HTTP Server Listen logger=http.server address=[::]:3000 protocol=http subUrl= socket=
ERROR[10-04|21:24:36] Could not get process start time, could not read "/proc": stat /proc: no such file or directory logger=grafana-apiserver
INFO [10-04|21:24:36] Adding GroupVersion playlist.grafana.app v0alpha1 to ResourceManager logger=grafana-apiserver
INFO [10-04|21:24:36] Adding GroupVersion featuretoggle.grafana.app v0alpha1 to ResourceManager logger=grafana-apiserver
INFO [10-04|21:24:39] Update check succeeded logger=plugins.update.checker duration=2.81132025s
INFO [10-04|21:24:39] Update check succeeded logger=grafana.update.checker duration=2.811167917s
INFO [10-04|21:25:11] Request Completed logger=context userId=0 orgId=0 uname= method=GET path=/ status=302 remote_addr=[::1] time_ms=1 duration=1.309584ms size=29 referer= handler=/ status_source=server
INFO [10-04|21:25:21] Usage stats are ready to report logger=infra.usagestats
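- Grafana is now running and can be opened in a browser at http://localhost:3000/
- On the login page, both the username and the password are admin
- After logging in, configure a data source so that Grafana can display the Prometheus data
- Choose Prometheus
- On the configuration page, only the Prometheus address needs to be filled in
- Then click the Save & test button at the bottom
- The data source is now configured, so let's look for good dashboards worth learning from
- Open https://grafana.com/grafana/dashboards/
- Since only node_exporter metrics are available right now, filter the list accordingly
- I chose the first result, Node Exporter Full. On its detail page, click the button that copies the dashboard ID (marked with a yellow arrow in the original screenshot)
- With the dashboard ID in hand, import it into Grafana; the value 1860 entered in the import form (arrow 3 in the original screenshot) is the dashboard ID
- On the next page, select the Prometheus data source and import
- After a successful import, the machine's metrics are displayed
- Note that the memory panels cannot be displayed, because the memory metrics on macOS differ from the ones the dashboard uses
- Pick one panel for a closer look; I chose the network monitoring panel, Network Traffic
- Open the panel's edit page to see more details
- The edit page shows how a Grafana panel uses Prometheus metrics, which is a useful reference for later development and configuration
- Pasting the same expression into the Prometheus page produces the same result
- That completes the whole build and verification process. If you are also compiling and using Prometheus, I hope this article serves as a useful reference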
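As an alternative to clicking through the UI, Grafana can also load data sources from provisioning files at startup. The sketch below writes such a file; it assumes the default provisioning directory conf/provisioning of a standalone install, and the function name and the file name prometheus.yml are my own choices.

```shell
#!/usr/bin/env bash
# Write a Grafana data source provisioning file that points at Prometheus.
# $1: provisioning directory (assumed default: conf/provisioning under the Grafana folder)
# $2: Prometheus address (defaults to the local server started earlier)
write_prometheus_datasource() {
  local dir="${1:-conf/provisioning}/datasources"
  local url="${2:-http://localhost:9090}"
  mkdir -p "$dir" || return 1
  cat > "$dir/prometheus.yml" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: ${url}
    isDefault: true
EOF
}
```

Run from the Grafana folder, `write_prometheus_datasource` would create conf/provisioning/datasources/prometheus.yml; Grafana reads provisioning files when it starts, so a restart is needed for the data source to appear.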