Building a Docker Container Monitoring System (2) (cAdvisor + Prometheus + Grafana)

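# cAdvisor overview

cAdvisor is an open-source tool from Google for displaying and analyzing the running state of containers. Running cAdvisor on a host makes it easy to collect runtime statistics for the containers on that host and present them to the user as charts.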

This article continues from the previous part.

# Deploy cAdvisor

### Deploy the cAdvisor container on the monitored host

Remove the existing containers first (note that this force-removes every container on the host):

    [root@agent ~]# docker rm -f $(docker ps -aq)
    c78b7f80fd41
    a76c56a3155b
    14c0398f35a2
    a0010d5c535f

Then start cAdvisor:

    [root@agent ~]# docker run -d \
    > --volume=/:/rootfs:ro \
    > --volume=/var/run:/var/run:ro \
    > --volume=/sys:/sys:ro \
    > --volume=/var/lib/docker/:/var/lib/docker:ro \
    > --volume=/dev/disk/:/dev/disk:ro \
    > --publish=8080:8080 \
    > --detach=true \
    > --name=cadvisor \
    > google/cadvisor:latest
    fbd537636358169b4bcbce652b94211b06c4c7aee41362ceeb456004510b7e82

#### Access the cAdvisor page

Open http://192.168.50.50:8080 to see the data that cAdvisor has collected.
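Besides the web UI, cAdvisor serves the same data in Prometheus text format at `/metrics`, which is the path Prometheus scrapes by default. As a quick sanity check, a small filter along these lines could list the container metric names an endpoint exposes. This is only a sketch: the function name `container_metric_names` is hypothetical and introduced for illustration, and the address in the usage comment is simply the one used in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of cAdvisor): read Prometheus text-format
# metrics from stdin and print the distinct metric names that start with
# "container_". Comment lines (# HELP / # TYPE) are skipped.
container_metric_names() {
  grep -v '^#' | grep -o '^container_[A-Za-z0-9_]*' | sort -u
}

# Usage, assuming cAdvisor is reachable at the address used in this article:
#   curl -s http://192.168.50.50:8080/metrics | container_metric_names
```

If the list comes back empty, the container is probably not running or port 8080 is not reachable from where the command is run.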

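# Prometheus overview

Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud. Since its inception in 2012, many companies and organizations have adopted Prometheus, and the project has a very active developer and user community. It is now a standalone open-source project, maintained independently of any company. To emphasize this and to clarify the project's governance structure, Prometheus joined the Cloud Native Computing Foundation (CNCF) in 2016 as the second hosted project after Kubernetes.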

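The main features of Prometheus are:

- A multi-dimensional data model, with time series identified by metric name and key/value pairs
- PromQL, a flexible query language
- No reliance on distributed storage; single server nodes are autonomous
- Time series are collected via a pull model over HTTP
- Pushing time series is supported via an intermediary gateway
- Targets are discovered via service discovery or static configuration
- Multiple modes of graphing and dashboarding are supported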
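To make the PromQL and HTTP points concrete, here is a minimal sketch of sending an instant query to the Prometheus HTTP API (`/api/v1/query`). The wrapper name `promql_query` is hypothetical, and the usage line assumes the server and the `docker` job configured later in this article are already running at the address shown.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run an instant PromQL query over HTTP.
#   $1: base URL of the Prometheus server
#   $2: PromQL expression
promql_query() {
  local base_url="$1" expr="$2"
  # -G turns the url-encoded pair into a GET query string; -s hides progress.
  curl -sG "${base_url}/api/v1/query" --data-urlencode "query=${expr}"
}

# Usage (assumes the deployment described in this article is up):
#   promql_query http://192.168.50.50:9090 'up{job="docker"}'
```

The `up` series is generated by Prometheus for every scrape target, so a value of 1 for the `docker` job indicates that the cAdvisor endpoint was scraped successfully.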

# Deploy Prometheus

    [root@agent ~]# docker pull prom/prometheus
    Using default tag: latest
    latest: Pulling from prom/prometheus
    3cb635b06aa2: Pull complete
    34f699df6fe0: Pull complete
    33d6c9635e0f: Pull complete
    f2af7323bed8: Pull complete
    c16675a6a294: Pull complete
    827843f6afe6: Pull complete
    3d272942eeaf: Pull complete
    7e785cfa34da: Pull complete
    05e324559e3b: Pull complete
    170620261a59: Pull complete
    ec35f5996032: Pull complete
    5509173eb708: Pull complete
    Digest: sha256:cb9817249c346d6cfadebe383ed3b3cd4c540f623db40c4ca00da2ada45259bb
    Status: Downloaded newer image for prom/prometheus:latest
    docker.io/prom/prometheus:latest

### Configure prometheus.yml

**Pay close attention to the formatting; YAML indentation mistakes are easy to make.**

    [root@agent ~]# vim /tmp/prometheus.yml

    # my global config
    global:
      scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
      evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
      # scrape_timeout is set to the global default (10s).

    # Alertmanager configuration
    alerting:
      alertmanagers:
        - static_configs:
            - targets:
              # - alertmanager:9093

    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    rule_files:
      # - "first_rules.yml"
      # - "second_rules.yml"

    # A scrape configuration containing exactly one endpoint to scrape:
    # Here it's Prometheus itself.
    scrape_configs:
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: 'prometheus'

        # metrics_path defaults to '/metrics'
        # scheme defaults to 'http'.

        static_configs:
          - targets: ['localhost:9090']

      - job_name: 'docker' # define a job (group) named docker
        static_configs:
          - targets: ['192.168.50.50:8080'] # one or more cAdvisor host addresses, separated by commas

### Run the container

    [root@agent ~]# docker run -d \
    > --name=prometheus -p 9090:9090 \
    > -v /tmp/prometheus.yml:/etc/prometheus/prometheus.yml \
    > -v /etc/localtime:/etc/localtime \
    > prom/prometheus
    a8d8416ff184232a062a71fa4ee458c904b74f6f7b86313539708fe435bd4dd1

Check that it started:

    [root@agent ~]# docker ps -a
    CONTAINER ID   IMAGE                    COMMAND                   CREATED         STATUS         PORTS                                       NAMES
    a8d8416ff184   prom/prometheus          "/bin/prometheus --c..."  2 minutes ago   Up 2 seconds   0.0.0.0:9090->9090/tcp, :::9090->9090/tcp   prometheus
    7c5c6cae02da   google/cadvisor:latest   "/usr/bin/cadvisor -..."  3 minutes ago   Up 3 minutes   0.0.0.0:8080->8080/tcp, :::8080->8080/tcp   cadvisor

#### Access the Prometheus page

[http://192.168.50.50:9090](http://192.168.50.50:9090)

![](https://file.jishuzhan.net/article/1689096702805413889/5b5351357c554b5aa33e563586aa3bae.png)

If the docker job shows the state UP, the cAdvisor target is being scraped normally.

![](https://file.jishuzhan.net/article/1689096702805413889/59c12d77588944b89e028616a0ab36ad.png)

Queries against the collected metrics work as well.

![](https://file.jishuzhan.net/article/1689096702805413889/556a1b06a8844c27b0796d21e3d5f5ce.png)

![](https://file.jishuzhan.net/article/1689096702805413889/b30bb34e39f742fd885fb05715cc8694.png)

# Deploy Grafana

    [root@agent ~]# docker run -d \
    > --name=grafana \
    > -p 3000:3000 \
    > grafana/grafana
    91f8dea9a3970f374e521eeb9203fab24e9ef766b8f95bb0672ea1706daa2e7d

Open http://192.168.50.50:3000. The default account is admin with the password admin; you must change the password on first login.

![](https://file.jishuzhan.net/article/1689096702805413889/f0a1f6fee48d4122a387d95d479ef4d8.png)

![](https://file.jishuzhan.net/article/1689096702805413889/a82e081b6a2641149fad5dbef84026de.png)

## Configure the data source

![](https://file.jishuzhan.net/article/1689096702805413889/e565ee2f441b4485b5dd31a81b5fc010.png)

![](https://file.jishuzhan.net/article/1689096702805413889/02b5991be749406a8ab1a0597cdd9f97.png)

![](https://file.jishuzhan.net/article/1689096702805413889/feb63daeb3de43caa08575cea318426d.png)

![](https://file.jishuzhan.net/article/1689096702805413889/f608164431564ee7a4af61e96418da70.png)

### Import a template

![](https://file.jishuzhan.net/article/1689096702805413889/8e40362a22c248d9aff22f3fc6340386.png)

![](https://file.jishuzhan.net/article/1689096702805413889/f500c9ea3ee2464eaa160abc1b613e26.png)

Select the corresponding data source and click Import to see the data from the monitored host.

![](https://file.jishuzhan.net/article/1689096702805413889/3354d6db5a6a4389863843f5d43109ab.png)

### Prepare a test container

    [root@agent ~]# docker run --name=nginx -d -p 80:80 nginx
    accb1ec5c8c9f711ba8d023474746beb32c041929b934029d41248c7c81c64d8

The new container shows up in the dashboard, so the setup works. Click Save in the upper-right corner.

![](https://file.jishuzhan.net/article/1689096702805413889/085229102612418392bc8fea1f4aac35.png)

![](https://file.jishuzhan.net/article/1689096702805413889/cc1ef1d174d048e4a433334d18ee5018.png)

This completes the deployment of the basic cAdvisor + Prometheus + Grafana architecture.
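Since formatting mistakes in prometheus.yml are easy to make, it may help to validate the file before starting (or restarting) the Prometheus container. The sketch below assumes that the `promtool` binary shipped in the `prom/prometheus` image is used for this; the function name `check_prometheus_config` is hypothetical, and the default path is the one used in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: validate a Prometheus config file with promtool,
# run from the same image that is used for the server.
#   $1: path to the config file on the host (defaults to /tmp/prometheus.yml)
check_prometheus_config() {
  local config="${1:-/tmp/prometheus.yml}"
  # Mount the file read-only and replace the image entrypoint with promtool.
  docker run --rm \
    -v "${config}:/etc/prometheus/prometheus.yml:ro" \
    --entrypoint promtool \
    prom/prometheus check config /etc/prometheus/prometheus.yml
}

# Usage:
#   check_prometheus_config /tmp/prometheus.yml
```

A non-zero exit status points to a syntax or indentation problem that should be fixed before the container is started with this file.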