【运维监控】influxdb 2.0+telegraf 监控tomcat 8.5运行情况(2)

关于java应用的监控本系列有文章如下:
【运维监控】influxdb 2.0+telegraf 监控tomcat 8.5运行情况
【运维监控】influxdb 2.0+grafana 监控java 虚拟机以及方法耗时情况
【运维监控】Prometheus+grafana监控tomcat运行情况
【运维监控】Prometheus+grafana监控spring boot 3运行情况

  • 本示例是通过telegraf拉取tomcat的监控指标数据到influxdb中,利用influxdb的dashboard模板来监控tomcat的运行情况。
  • 本示例使用到的组件均是最新的,下文中会有具体版本说明,linux环境是centos。
  • 本示例分为四个部分,即influxdb、telegra、tomcat的部署和三者集成的监控tomcat。
  • 本文旨在说明三者如何使用,不涉及各自组件的介绍,如果需要使用到本文的,肯定都有了解。

说明:本示例仅仅是为了展示三者结合使用,故没有考虑集群部署以及实际环境的使用,都部署在server4上,实际的使用则没有这样的要求。telegraf需要收集tomcat的运行指标,则需要一起部署。

该文章太长,故分成2个部分
【运维监控】influxdb 2.0+telegraf 监控tomcat 8.5运行情况(1)
【运维监控】influxdb 2.0+telegraf 监控tomcat 8.5运行情况(2)
【运维监控】influxdb 2.0+telegraf 监控tomcat 8.5运行情况(完整版)

四、influxdb集成telegraf监控tomcat的运行情况

1、创建dashboard方式介绍

登录http://server4:8086/后选择创建dashboard,如下图所示。

这里创建dashboard有三种方式,即直接创建、导入和添加模板。直接创建就是自己添加cell,自己根据监控的指标进行组织数据和布局;导入模板就是上传json文件或粘贴文件;添加模板则是根据influxdb提供的模板进行添加,本文着重介绍的内容。三种方式的截图依次如下。


2、添加dashboard的模板

添加模板步骤有2步,即选择需要的模板,然后添加URL安装。

在上图中浏览模板社区按钮对应的地址如下:https://github.com/influxdata/community-templates#templates

该链接打开后,包含开源的所有模板,如下图所示。

本示例选择tomcat,如下图所示

点击"tomcat dashboard"对应的链接https://github.com/influxdata/community-templates/tree/master/tomcat后进入下面页面

复制上图红框中对应的链接https://raw.githubusercontent.com/influxdata/community-templates/master/tomcat/tomcat.yml到influxdb中,并填入下图所示的位置

点击"lookup template",在弹出框中

点击安装,安装完成后,如下图所示。

由上图可以看出模板创建成后,会同时创建名称为tomcat的bucket以及获取指标数据的方式telegraf。

  • 以下说明仅供参考,如果按照本文操作顺序,则不需要关注。

导入的dashboard,如果需要bucket和telegraf的,默认是创建好的,但需要自己手动针对bucket添加数据,即setdata,根据需要设置数据。如果是默认telegraf的,则需要如下图所示,创建telegraf的配置文件,最终会生成token和链接,在对应需要监控的机器上执行即可。



3、配置telegraf

在下面页面中找到模板创建的bucket名称,点击add data,如下图所示。

由于模板是通过telegraf拉取数据的,故我们同样需要选择telegraf,如下图所示。

在下面界面选择需要收集的应用,比如本示例的tomcat,如下页面所示。

进入下面页面进行信息配置,比如(此处是最基本的配置,其他配置不再赘述)

influxdb的地址、

org name(就是influxdb初始化填写的那个,也可以通过页面修改或创建新的,视情况而定)、

bucket(此处需要填写和模板对应的bucket名称,不能变化,模板对应的bucket没地方可修改)、

tomcat的指标信息收集地址(参考tomcat配置修改中验证的地址)、

tomcat指标信息访问的用户和密码(参考tomcat配置信息中的users.xml中的配置)

具体如下图所示。

该配置的完整信息如下:

bash 复制代码
# Configuration for telegraf agent
[agent]
  ## Default data collection interval for all inputs
  interval = "10s"
  ## Rounds collection interval to 'interval'
  ## ie, if interval="10s" then always collect on :00, :10, :20, etc.
  round_interval = true

  ## Telegraf will send metrics to outputs in batches of at most
  ## metric_batch_size metrics.
  ## This controls the size of writes that Telegraf sends to output plugins.
  metric_batch_size = 1000

  ## Maximum number of unwritten metrics per output.  Increasing this value
  ## allows for longer periods of output downtime without dropping metrics at the
  ## cost of higher maximum memory usage.
  metric_buffer_limit = 10000

  ## Collection jitter is used to jitter the collection by a random amount.
  ## Each plugin will sleep for a random time within jitter before collecting.
  ## This can be used to avoid many plugins querying things like sysfs at the
  ## same time, which can have a measurable effect on the system.
  collection_jitter = "0s"

  ## Default flushing interval for all outputs. Maximum flush_interval will be
  ## flush_interval + flush_jitter
  flush_interval = "10s"
  ## Jitter the flush interval by a random amount. This is primarily to avoid
  ## large write spikes for users running a large number of telegraf instances.
  ## ie, a jitter of 5s and interval 10s means flushes will happen every 10-15s
  flush_jitter = "0s"

  ## By default or when set to "0s", precision will be set to the same
  ## timestamp order as the collection interval, with the maximum being 1s.
  ##   ie, when interval = "10s", precision will be "1s"
  ##       when interval = "250ms", precision will be "1ms"
  ## Precision will NOT be used for service inputs. It is up to each individual
  ## service input to set the timestamp at the appropriate precision.
  ## Valid time units are "ns", "us" (or "µs"), "ms", "s".
  precision = ""

  ## Log at debug level.
  # debug = false
  ## Log only error level messages.
  # quiet = false

  ## Log target controls the destination for logs and can be one of "file",
  ## "stderr" or, on Windows, "eventlog".  When set to "file", the output file
  ## is determined by the "logfile" setting.
  # logtarget = "file"

  ## Name of the file to be logged to when using the "file" logtarget.  If set to
  ## the empty string then logs are written to stderr.
  # logfile = ""

  ## The logfile will be rotated after the time interval specified.  When set
  ## to 0 no time based rotation is performed.  Logs are rotated only when
  ## written to, if there is no log activity rotation may be delayed.
  # logfile_rotation_interval = "0d"

  ## The logfile will be rotated when it becomes larger than the specified
  ## size.  When set to 0 no size based rotation is performed.
  # logfile_rotation_max_size = "0MB"

  ## Maximum number of rotated archives to keep, any older logs are deleted.
  ## If set to -1, no archives are removed.
  # logfile_rotation_max_archives = 5

  ## Pick a timezone to use when logging or type 'local' for local time.
  ## Example: America/Chicago
  # log_with_timezone = ""

  ## Override default hostname, if empty use os.Hostname()
  hostname = ""
  ## If set to true, do no set the "host" tag in the telegraf agent.
  omit_hostname = false
[[outputs.influxdb_v2]]
  ## The URLs of the InfluxDB cluster nodes.
  ##
  ## Multiple URLs can be specified for a single cluster, only ONE of the
  ## urls will be written to each interval.
  ##   ex: urls = ["https://us-west-2-1.aws.cloud2.influxdata.com"]
  urls = ["http://192.168.10.43:8086"]

  ## Token for authentication.
  token = "$INFLUX_TOKEN"

  ## Organization is the name of the organization you wish to write to; must exist.
  organization = "alanchan_win"

  ## Destination bucket to write into.
  bucket = "tomcat"

  ## The value of this tag will be used to determine the bucket.  If this
  ## tag is not set the 'bucket' option is used as the default.
  # bucket_tag = ""

  ## If true, the bucket tag will not be added to the metric.
  # exclude_bucket_tag = false

  ## Timeout for HTTP messages.
  # timeout = "5s"

  ## Additional HTTP headers
  # http_headers = {"X-Special-Header" = "Special-Value"}

  ## HTTP Proxy override, if unset values the standard proxy environment
  ## variables are consulted to determine which proxy, if any, should be used.
  # http_proxy = "http://corporate.proxy:3128"

  ## HTTP User-Agent
  # user_agent = "telegraf"

  ## Content-Encoding for write request body, can be set to "gzip" to
  ## compress body or "identity" to apply no encoding.
  # content_encoding = "gzip"

  ## Enable or disable uint support for writing uints influxdb 2.0.
  # influx_uint_support = false

  ## Optional TLS Config for use on HTTP connections.
  # tls_ca = "/etc/telegraf/ca.pem"
  # tls_cert = "/etc/telegraf/cert.pem"
  # tls_key = "/etc/telegraf/key.pem"
  ## Use TLS but skip chain & host verification
  # insecure_skip_verify = false
# Gather metrics from the Tomcat server status page.
[[inputs.tomcat]]
  ## URL of the Tomcat server status
  url = "http://server4:8080/manager/status/all?XML=true"

  ## HTTP Basic Auth Credentials
  username = "tomcat"
  password = "tomcat"

  ## Request timeout
  # timeout = "5s"

  ## Optional TLS Config
  # tls_ca = "/etc/telegraf/ca.pem"
  # tls_cert = "/etc/telegraf/cert.pem"
  # tls_key = "/etc/telegraf/key.pem"
  ## Use TLS but skip chain & host verification
  # insecure_skip_verify = false

完成以上配置保存即可。

4、获取token以及启动telegraf

完成上面的配置后,可以在下面页面获取token以及启动telegraf对应的链接地址,点击如下图中的"Setup Instructions"点击,如下图所示。

在弹出页面中生成对应的token和telegraf链接,如下图所示。

5、启动telegraf

在部署tomcat的机器上(server4),启动telegraf,操作命令如下

bash 复制代码
[alanchan@server4 ~]$ export INFLUX_TOKEN=HqBG1S4vRgOvwkBFsYn0Rj4gK3DkYxrUOnozy9PCBqAEG3f_1qHva0suA_ScSYfq-a4U8joQY9UFXbGzkwb1mA==

[alanchan@server4 bin]$ pwd
/usr/local/bigdata/telegraf-1.31.3/usr/bin
[alanchan@server4 bin]$ ll
total 239084
-rwxr-xr-x 1 alanchan root 244813976 Aug 12 22:55 telegraf
[alanchan@server4 bin]$  telegraf --config http://server4:8086/api/v2/telegrafs/0d8f726127609000

6、验证influxdb的dashboard

在http://server4:8086中点击dashboard对应的Apache Tomcat对应的dashboard即可,如下图所示。由于作者是运行比较久的情况,所以数据较多,读者刚部署完成,数据量应该没有那么多。当然,读者也可以通过数据探索查看是否将数据拉取过来了。

以上,完成了通过telegraf拉取tomcat的监控指标数据到influxdb中,利用influxdb的dashboard模板来监控tomcat的运行情况。

相关推荐
人工智能训练2 小时前
OpenEnler等Linux系统中安装git工具的方法
linux·运维·服务器·git·vscode·python·ubuntu
Wang15302 小时前
jdk内存配置优化
java·计算机网络
0和1的舞者2 小时前
Spring AOP详解(一)
java·开发语言·前端·spring·aop·面向切面
Wang15302 小时前
Java多线程死锁排查
java·计算机网络
QT 小鲜肉2 小时前
【Linux命令大全】001.文件管理之which命令(实操篇)
linux·运维·服务器·前端·chrome·笔记
小小星球之旅3 小时前
CompletableFuture学习
java·开发语言·学习
jiayong233 小时前
知识库概念与核心价值01
java·人工智能·spring·知识库
皮皮林5513 小时前
告别 OOM:EasyExcel 百万数据导出最佳实践(附开箱即用增强工具类)
java
fantasy5_54 小时前
Linux 动态进度条实战:从零掌握开发工具与核心原理
linux·运维·服务器
weixin_462446234 小时前
exo + tinygrad:Linux 节点设备能力自动探测(NVIDIA / AMD / CPU 安全兜底)
linux·运维·python·安全