Apache SkyWalking 监控 Linux 实战

SkyWalking 从 8.4 版本开始支持监控主机,用户可以轻松从 dashboard 上检测可能的问题,例如当 CPU 使用过载、内存或磁盘空间不足或者当网络状态不健康时等。 与监控 MySQL Server 类似,SkyWalking 也是利用 Prometheus 和 OpenTelemetry 收集主机的 metrics 数据。 同时 SkyWalking 也提供了使用 InfluxDB Telegraf 通过 Telegraf receiver 接收主机的 metrics 数据,telegraf receiver 插件负责接收、处理和转换 metrics, 然后将转换后的数据发送给 SkyWalking MAL 处理。

方式一:Prometheus + OpenTelemetry,处理流程如下:

  • Prometheus Node Exporter 从主机收集 metrics 数据.
  • OpenTelemetry Collector 通过 Prometheus Receiver 从 Node Exporters 抓取 metrics 数据, 然后将 metrics 推送的到 SkyWalking OAP Server.
  • SkyWalking OAP Server 通过 MAL 引擎去分析、计算、聚合和存储,处理规则位于 /config/otel-oc-rules/vm.yaml 文件.
  • 用户可以通过 SkyWalking WebUI dashboard 查看监控数据。

方式二:通过 Telegraf receiver 具体参考官网文档的部署方式 skywalking.apache.org/docs/main/n...

下面让我们一起开始部署吧

部署 SkyWalking

之前的文章都是默认大家会部署 SkyWalking 的,这里我也提供下平时测试用的部署方式,我这里采用 docker compose 部署,主要是参考官网提供的部署配置, SkyWalking 提供的部署基础配置在代码目录中,链接地址 github.com/apache/skyw...

docker-compose.yml
yaml 复制代码
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

version: '3.8'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch-oss:${ES_VERSION}
    container_name: elasticsearch
    ports:
      - "9200:9200"
    healthcheck:
      test: [ "CMD-SHELL", "curl --silent --fail localhost:9200/_cluster/health || exit 1" ]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    environment:
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1

  oap:
    image: ${OAP_IMAGE}
    container_name: oap
    depends_on:
      elasticsearch:
        condition: service_healthy
    links:
      - elasticsearch
    ports:
      - "11800:11800"
      - "12800:12800"
    healthcheck:
      test: [ "CMD-SHELL", "/skywalking/bin/swctl ch" ]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    environment:
      SW_STORAGE: elasticsearch
      SW_STORAGE_ES_CLUSTER_NODES: elasticsearch:9200
      SW_HEALTH_CHECKER: default
      SW_TELEMETRY: prometheus
      JAVA_OPTS: "-Xms2048m -Xmx2048m"

  ui:
    image: ${UI_IMAGE}
    container_name: ui
    depends_on:
      oap:
        condition: service_healthy
    links:
      - oap
    ports:
      - "8080:8080"
    environment:
      SW_OAP_ADDRESS: http://oap:12800
      SW_ZIPKIN_ADDRESS: http://oap:9412
.env

用于镜像版本,这里使用 SkyWalking 9.7.0 版本,也是当前最新的版本

text 复制代码
# The docker-compose.yml file is meant to be used locally for testing only after a local build, if you want to use it
# with officially released Docker images, please modify the environment variables on your command line interface.
# i.e.:
# export OAP_IMAGE=apache/skywalking-oap-server:<tag>
# export UI_IMAGE=apache/skywalking-ui:<tag>
# docker compose up

ES_VERSION=7.4.2
OAP_IMAGE=apache/skywalking-oap-server:9.7.0
UI_IMAGE=apache/skywalking-ui:9.7.0

启动 SkyWalking

shell 复制代码
docker compose up

启动完成后,访问 http://IP:8080 就可以正常打开 dashboard 页面了,当然这个时候还看不到监控数据。

部署 Prometheus node-exporter

node-exporter 官方文档 ,下载地址 prometheus.io/download/#n... 这里下载 1.7.0 版本

shell 复制代码
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar xvfz node_exporter-1.7.0.linux-amd64.tar.gz
cd node_exporter-1.7.0.linux-amd64
./node_exporter

启动成功后访问 http://IP:9100/metrics 可以看到采集到的 metrics 信息。

部署 OpenTelemetry Collector

OpenTelemetry Collector 官方文档 这里将 otel-collector 和 skywalking 一起通过 docker compose 部署,完整配置如下:

yaml 复制代码
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

version: '3.8'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch-oss:${ES_VERSION}
    container_name: elasticsearch
    ports:
      - "9200:9200"
    healthcheck:
      test: [ "CMD-SHELL", "curl --silent --fail localhost:9200/_cluster/health || exit 1" ]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    environment:
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1

  oap:
    image: ${OAP_IMAGE}
    container_name: oap
    depends_on:
      elasticsearch:
        condition: service_healthy
    links:
      - elasticsearch
    ports:
      - "11800:11800"
      - "12800:12800"
    healthcheck:
      test: [ "CMD-SHELL", "/skywalking/bin/swctl ch" ]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    environment:
      SW_STORAGE: elasticsearch
      SW_STORAGE_ES_CLUSTER_NODES: elasticsearch:9200
      SW_HEALTH_CHECKER: default
      SW_TELEMETRY: prometheus
      JAVA_OPTS: "-Xms2048m -Xmx2048m"

  ui:
    image: ${UI_IMAGE}
    container_name: ui
    depends_on:
      oap:
        condition: service_healthy
    links:
      - oap
    ports:
      - "8080:8080"
    environment:
      SW_OAP_ADDRESS: http://oap:12800
      SW_ZIPKIN_ADDRESS: http://oap:9412

  otel-collector:
    image: otel/opentelemetry-collector:0.50.0
    container_name: otel-collector
    command: [ "--config=/etc/otel-collector-config.yaml" ]
    volumes:
      - /opt/docker/skywalking/otel-collector-config.yaml:/etc/otel-collector-config.yaml
    expose:
      - 55678

在 /opt/docker/skywalking 目录(也可以自定义)创建配置文件 otel-collector-config.yaml

yaml 复制代码
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: "vm-monitoring" # make sure to use this in the vm.yaml to filter only VM metrics
          scrape_interval: 10s
          static_configs:
            - targets: ["192.168.56.102:9100"] # OpenTelemetry Collector 的地址,根据自己的服务器IP进行调整

processors:
  batch:

exporters:
  otlp:
    endpoint: "oap:11800" # The OAP Server address
    # The config format of OTEL version prior to 0.34.0, eg. 0.29.0, should be:
    # insecure: true
    tls:
      insecure: true
    #insecure: true
  # Exports data to the console
  logging:
    loglevel: debug

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [otlp, logging]

docker compose 重新启动

shell 复制代码
docker compose restart

访问 SkyWalking dashboard 页面,正常情况下就可以看到监控到的数据了。

相关链接

更多精彩内容请关注公众号 geekymv,喜欢请分享给更多的朋友哦」如有问题,欢迎交流。

相关推荐
闲猫5 小时前
go orm GORM
开发语言·后端·golang
丁卯4046 小时前
Go语言中使用viper绑定结构体和yaml文件信息时,标签的使用
服务器·后端·golang
山河已无恙9 小时前
基于 DeepSeek LLM 本地知识库搭建开源方案(AnythingLLM、Cherry、Ragflow、Dify)认知
开源·知识库·deepseek
bing_1589 小时前
简单工厂模式 (Simple Factory Pattern) 在Spring Boot 中的应用
spring boot·后端·简单工厂模式
天上掉下来个程小白10 小时前
案例-14.文件上传-简介
数据库·spring boot·后端·mybatis·状态模式
AI服务老曹10 小时前
运用先进的智能算法和优化模型,进行科学合理调度的智慧园区开源了
运维·人工智能·安全·开源·音视频
gz927cool10 小时前
大模型做导师之开源项目学习(lightRAG)
学习·开源·mfc
Asthenia041211 小时前
基于Jackson注解的JSON工具封装与Redis集成实战
后端
编程星空11 小时前
css主题色修改后会多出一个css吗?css怎么定义变量?
开发语言·后端·rust
Ainnle11 小时前
企业级RAG开源项目分享:Quivr、MaxKB、Dify、FastGPT、RagFlow
人工智能·开源