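The Ultimate Spring Boot Admin Monitoring Setup: Building an Enterprise-Grade Microservice Monitoring Platform from Scratch, Including High-Availability Cluster Configuration

Day 10: Service monitoring! Real-time monitoring with Spring Boot Admin, so the boss never again has to worry that a service went down and I didn't know 😱

I. The pain of monitoring: the service is down, and I'm the last to know!

Zero-to-full-stack Java microservices in practice: backend, frontend, operations, and three enterprise-grade hands-on projects

Getting the resources: follow the WeChat official account 小坏说Java to get all the sample code, configuration templates, and export tools for this article.

A real-world scenario:

3 a.m., and the boss is on the phone

  • Boss: "Did you know the system is down?!"
  • You: "Huh? Let me check... (half asleep)"
  • Boss: "Users have been complaining for half an hour!"
  • You: "Monitoring didn't raise any alert..."
  • Boss: "Then what do I need you for?!"

Once you have Spring Boot Admin

  • Your phone gets the alert first
  • You are already working on it
  • By the time the boss calls, you have fixed it
  • You: "Don't worry, boss, the problem is solved and users never noticed~"
  • Boss: "👍"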
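II. Spring Boot Admin: the "personal bodyguard" of your microservices 👮

Traditional monitoring vs Spring Boot Admin

Traditional monitoring               | Spring Boot Admin
Dig through logs to find problems    | Visual overview at a glance
Check processes by hand              | Automatic service discovery
No idea how much memory is in use    | Real-time charts
Find out only after a restart        | Automatic alert notifications
Look at each service separately      | Central management of all services

What can Spring Boot Admin do?

  1. Health checks: heartbeat detection, is the service still alive
  2. Performance monitoring: CPU, memory, thread pools
  3. Real-time logs: no need to log in to the server
  4. Environment configuration: see every setting at a glance
  5. Metrics collection: JVM, database, cache
  6. Alert notifications: WeChat, DingTalk, email
  7. Security management: access control, so not just anyone can look
  8. Batch operations: one-click restart, configuration refresh

III. Setting up the Spring Boot Admin server (the monitoring center)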

Step 1: Create the admin-server project

xml
<!-- Spring Boot Admin Server -->
<dependency>
    <groupId>de.codecentric</groupId>
    <artifactId>spring-boot-admin-starter-server</artifactId>
    <version>2.7.10</version>
</dependency>

<!-- Web support -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>

<!-- Security (a must in production) -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-security</artifactId>
</dependency>

<!-- Mail notifications -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-mail</artifactId>
</dependency>

<!-- Nacos integration (service discovery) -->
<dependency>
    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-starter-alibaba-nacos-discovery</artifactId>
</dependency>

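Step 2: Configuration file (application.yml)

yaml
server:
  port: 8888  # Admin Server port

spring:
  application:
    name: admin-server

  # Security settings (a must in production!)
  security:
    user:
      name: admin          # username
      password: admin123   # password (change it outside of demos)
      roles: ADMIN

  # Mail settings used by the mail notifier
  mail:
    host: smtp.qq.com
    port: 587
    username: your-email@qq.com
    password: your-auth-code  # the SMTP authorization code, not the account password!
    properties:
      mail:
        smtp:
          auth: true
          starttls:
            enable: true
            required: true

  # Nacos service discovery
  cloud:
    nacos:
      discovery:
        server-addr: localhost:8848

  # Spring Boot Admin settings. They stay under the same top-level "spring:" key;
  # writing "spring:" a second time in one YAML document is a duplicate key.
  boot:
    admin:
      # Notification settings
      notify:
        mail:
          enabled: true  # turn on mail notifications
          to: your-email@qq.com, boss@company.com  # recipients
          from: your-email@qq.com  # sender

        # DingTalk notifications (more practical)
        dingtalk:
          enabled: true
          webhook-url: https://oapi.dingtalk.com/robot/send?access_token=xxx
          secret: xxx
          message: "Service alert: #{instance.registration.name} (#{instance.id}) status: #{event.statusInfo.status} details: #{event.statusInfo.details}"

        # WeChat Work notifications. SBA 2.7 does not appear to ship a built-in WeChat
        # notifier, so these keys only take effect with a custom notifier bound to them.
        wechat:
          enabled: true
          webhook-url: https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxx

      # Credentials the server uses to call secured client endpoints
      # (only needed if the clients enable security as well)
      instance-auth:
        enabled: false
        default-user-name: client
        default-password: client123

      # Instance settings. Metadata and prefer-ip are normally set on the client side
      # (spring.boot.admin.client.instance.*); verify these server-side keys for your version.
      instance:
        # Metadata
        metadata:
          user:
            name: developer
            email: dev@company.com

        # Display title
        title: ${spring.application.name}
        prefer-ip: true  # show the IP instead of the host name

      # UI settings
      ui:
        title: Spring Boot Admin Monitoring Center
        brand: "🚀 Microservice Monitoring Platform"
        favicon: assets/img/favicon.png
        login-icon: assets/img/login-icon.png
        # Custom theme (verify the key names for your version)
        theme:
          primary-color: "#1890ff"
          secondary-color: "#f5222d"

      # Monitoring settings
      monitor:
        # Default timeout for requests to client endpoints, in milliseconds
        default-timeout: 10000
        # Custom monitoring metric (verify this key for your version)
        status:
          l1m: "http://localhost:${server.port}/actuator/metrics/system.cpu.usage?tag=alpha:l1m"

      # History retention (verify this key for your version)
      history:
        retention-time: 24h  # keep 24 hours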

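Step 3: Main class

java
import de.codecentric.boot.admin.server.config.EnableAdminServer;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.discovery.EnableDiscoveryClient;

@SpringBootApplication
@EnableAdminServer  // the key annotation: enables the Admin Server
@EnableDiscoveryClient  // enables service discovery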
public class AdminServerApplication {
    public static void main(String[] args) {
        SpringApplication.run(AdminServerApplication.class, args);
    }
}

Step 4: Security configuration (important!)

java
@Configuration
@EnableWebSecurity
public class SecurityConfig extends WebSecurityConfigurerAdapter {
    
    @Override
    protected void configure(HttpSecurity http) throws Exception {
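        // Disable CSRF (acceptable for an internal admin tool with modest security requirements)
        http.csrf().disable();
        
        // Authorization rules
        http.authorizeRequests()
            // static assets, the login page, and the actuator endpoints are open to everyone
            .antMatchers("/assets/**", "/login", "/actuator/**").permitAll()
            // every other request requires authentication
            .anyRequest().authenticated()
            .and()
            // form login
            .formLogin()
            .loginPage("/login")
            .defaultSuccessUrl("/")
            .permitAll()
            .and()
            // remember-me
            .rememberMe()
            .and()
            // logout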
            .logout()
            .logoutUrl("/logout")
            .logoutSuccessUrl("/login")
            .permitAll();
    }
    
    @Override
    protected void configure(AuthenticationManagerBuilder auth) throws Exception {
        // In-memory users (use a database in production)
        auth.inMemoryAuthentication()
            .withUser("admin")
            .password(passwordEncoder().encode("admin123"))
            .roles("ADMIN")
            .and()
            .withUser("viewer")
            .password(passwordEncoder().encode("viewer123"))
            .roles("VIEWER");
    }
    
    @Bean
    public PasswordEncoder passwordEncoder() {
        return new BCryptPasswordEncoder();
    }
}

Open http://localhost:8888 and log in with admin/admin123

IV. Adapting the microservice clients (the monitored services)

Step 1: Add the dependencies to every microservice

xml
<!-- Spring Boot Admin Client -->
<dependency>
    <groupId>de.codecentric</groupId>
    <artifactId>spring-boot-admin-starter-client</artifactId>
    <version>2.7.10</version>
</dependency>

<!-- Actuator (required!) -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

<!-- Health check extensions -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>

Step 2: Client configuration file

yaml
spring:
  boot:
    admin:
      client:
        url: http://localhost:8888  # Admin Server address
        instance:
          name: ${spring.application.name}  # instance name
          service-url: http://${spring.cloud.client.ip-address}:${server.port}  # service URL
          metadata:
            user:
              name: developer
              email: dev@company.com
            tags:
              env: ${spring.profiles.active}
              version: 1.0.0
          # Health check details (best turned off in production; verify this key for your version)
          health-info:
            diskspace:
              enabled: true
            db:
              enabled: true
            redis:
              enabled: true

# Actuator settings (endpoint exposure).
# "management" is a top-level key, not a child of "spring".
management:
  endpoints:
    web:
      exposure:
        include: "*"  # expose every endpoint (open only what you need in production)
      base-path: /actuator  # endpoint path

  endpoint:
    health:
      show-details: always  # show health details
      probes:
        enabled: true  # Kubernetes probes

    # Enable all endpoints
    beans:
      enabled: true
    env:
      enabled: true
    info:
      enabled: true
    metrics:
      enabled: true
    loggers:
      enabled: true
    caches:
      enabled: true
    scheduledtasks:
      enabled: true
    threaddump:
      enabled: true
    heapdump:
      enabled: true
    prometheus:
      enabled: true  # needs micrometer-registry-prometheus on the classpath

  # Metrics export
  metrics:
    export:
      prometheus:
        enabled: true
    # Custom metric tags
    tags:
      application: ${spring.application.name}
      instance: ${spring.cloud.client.ip-address}:${server.port}

  # Health checks
  health:
    # Built-in health indicators
    diskspace:
      enabled: true
      threshold: 10MB  # disk space threshold
    db:
      enabled: true
    redis:
      enabled: true
    # Default for all health indicators
    defaults:
      enabled: true

  # Tracing (management.tracing.* is read by Spring Boot 3.x)
  tracing:
    sampling:
      probability: 1.0

  # Info
  info:
    env:
      enabled: true
    git:
      mode: full
    java:
      enabled: true
    os:
      enabled: true
    build:
      enabled: true
Step 3: Custom health check

java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.actuate.health.AbstractHealthIndicator;
import org.springframework.boot.actuate.health.Health;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Component;

// UserService is the application's own service (not shown here).
@Component
public class CustomHealthIndicator extends AbstractHealthIndicator {
    
    @Autowired
    private UserService userService;
    
    @Autowired
    private RedisTemplate<String, String> redisTemplate;
    
    @Override
    protected void doHealthCheck(Health.Builder builder) throws Exception {
        // 1. Check the database connection
        boolean dbHealthy = checkDatabase();
        
        // 2. Check the Redis connection
        boolean redisHealthy = checkRedis();
        
        // 3. Check the business logic
        boolean businessHealthy = checkBusiness();
        
        if (dbHealthy && redisHealthy && businessHealthy) {
            builder.up()
                   .withDetail("database", "connection OK")
                   .withDetail("redis", "connection OK")
                   .withDetail("business", "business OK")
                   .withDetail("timestamp", System.currentTimeMillis());
        } else {
            builder.down()
                   .withDetail("database", dbHealthy ? "OK" : "FAILED")
                   .withDetail("redis", redisHealthy ? "OK" : "FAILED")
                   .withDetail("business", businessHealthy ? "OK" : "FAILED")
                   .withException(new RuntimeException("Health check failed"));
        }
    }
    
    private boolean checkDatabase() {
        try {
            // Run a simple query
            userService.count();
            return true;
        } catch (Exception e) {
            return false;
        }
    }
    
    private boolean checkRedis() {
        try {
            redisTemplate.opsForValue().get("health-check");
            return true;
        } catch (Exception e) {
            return false;
        }
    }
    
    private boolean checkBusiness() {
        // Check the business logic, for example a call to an external API
        try {
            // Simulated business check
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
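
The indicator above is only one contributor; Actuator folds every contributor into a single overall status, and that overall status is what Spring Boot Admin shows. A minimal plain-Java sketch of the "most severe status wins" idea, assuming the severity order DOWN, OUT_OF_SERVICE, UP, UNKNOWN that Spring Boot documents as its default. The class name StatusAggregationSketch is illustrative and is not a Spring class:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/** Illustrative sketch of "most severe status wins"; not Spring Boot's own aggregator. */
final class StatusAggregationSketch {

    // Severity order, most severe first (assumed to match Spring Boot's default order).
    private static final List<String> ORDER =
        Arrays.asList("DOWN", "OUT_OF_SERVICE", "UP", "UNKNOWN");

    /** Returns the most severe known status, or UNKNOWN when none is recognized. */
    static String aggregate(Collection<String> statuses) {
        int best = ORDER.size();
        for (String status : statuses) {
            int index = ORDER.indexOf(status);
            if (index >= 0 && index < best) {
                best = index;
            }
        }
        return best == ORDER.size() ? "UNKNOWN" : ORDER.get(best);
    }
}
```

With this rule a single failing contributor is enough: aggregating UP, DOWN, UP yields DOWN, which is why one broken Redis check can mark the whole instance as down.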

Step 4: Custom metrics

java
@Component
public class BusinessMetrics {
    
    // Counter: number of orders created
    private final Counter orderCreateCounter = 
        Metrics.counter("business.order.create.count");
    
    // Timer: order processing time
    private final Timer orderProcessTimer = 
        Timer.builder("business.order.process.time")
             .description("Order processing time")
             .register(Metrics.globalRegistry);
    
    // Gauge: current number of active users
    private final AtomicInteger activeUsers = new AtomicInteger(0);
    
    @PostConstruct
    public void init() {
        // Register the custom gauge
        Gauge.builder("business.user.active.count", activeUsers, AtomicInteger::get)
             .description("Current number of active users")
             .register(Metrics.globalRegistry);
    }
    
    public void incrementOrderCount() {
        orderCreateCounter.increment();
    }
    
    public Timer.Sample startOrderProcess() {
        return Timer.start(Metrics.globalRegistry);
    }
    
    public void endOrderProcess(Timer.Sample sample) {
        sample.stop(orderProcessTimer);
    }
    
    public void userLogin() {
        activeUsers.incrementAndGet();
    }
    
    public void userLogout() {
        activeUsers.decrementAndGet();
    }
}

V. Advanced monitoring features

1. Real-time log viewing

yaml
# Client configuration
management:
  endpoint:
    logfile:
      enabled: true
      external-file: ./logs/app.log  # the log file to serve
  
logging:
  file:
    name: ./logs/app.log
  pattern:
    file: "%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n"

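In the Admin UI:

  • Click the service instance
  • Click the "Loggers" tab
  • View and change log levels at runtime
  • Click "Logfile" to follow the log in real time

2. JVM monitoring

java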
@RestController
@RequestMapping("/monitor")
public class JvmMonitorController {
    
    @Autowired
    private MeterRegistry meterRegistry;
    
    @GetMapping("/jvm")
    public Map<String, Object> jvmInfo() {
        Map<String, Object> info = new HashMap<>();
        
        // Memory information
        Runtime runtime = Runtime.getRuntime();
        info.put("maxMemory", runtime.maxMemory() / 1024 / 1024 + "MB");
        info.put("totalMemory", runtime.totalMemory() / 1024 / 1024 + "MB");
        info.put("freeMemory", runtime.freeMemory() / 1024 / 1024 + "MB");
        info.put("usedMemory", 
                 (runtime.totalMemory() - runtime.freeMemory()) / 1024 / 1024 + "MB");
        
        // GC information
        List<GarbageCollectorMXBean> gcBeans = 
            ManagementFactory.getGarbageCollectorMXBeans();
        List<Map<String, Object>> gcInfos = new ArrayList<>();
        for (GarbageCollectorMXBean gc : gcBeans) {
            Map<String, Object> gcInfo = new HashMap<>();
            gcInfo.put("name", gc.getName());
            gcInfo.put("collectionCount", gc.getCollectionCount());
            gcInfo.put("collectionTime", gc.getCollectionTime() + "ms");
            gcInfos.add(gcInfo);
        }
        info.put("garbageCollectors", gcInfos);
        
        // Thread information
        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        info.put("threadCount", threadBean.getThreadCount());
        info.put("peakThreadCount", threadBean.getPeakThreadCount());
        info.put("daemonThreadCount", threadBean.getDaemonThreadCount());
        
        return info;
    }
    
    @GetMapping("/metrics/{name}")
    public List<Measurement> getMetric(@PathVariable String name) {
        // Look up a specific metric
        Meter meter = meterRegistry.find(name).meter();
        if (meter != null) {
            List<Measurement> measurements = new ArrayList<>();
            meter.measure().forEach(measurements::add);
            return measurements;
        }
        return Collections.emptyList();
    }
}

3. Database connection pool monitoring

yaml
# Use the Druid connection pool
spring:
  datasource:
    type: com.alibaba.druid.pool.DruidDataSource
    druid:
      # Monitoring settings
      stat-view-servlet:
        enabled: true
        url-pattern: /druid/*
        login-username: admin
        login-password: admin123
      web-stat-filter:
        enabled: true
      filter:
        stat:
          enabled: true
          log-slow-sql: true
          slow-sql-millis: 1000
        wall:
          enabled: true
        config:
          enabled: true

Integrating Druid in the Admin UI:

java
@Configuration
public class DruidConfig {
    
    @Bean
    public ServletRegistrationBean<StatViewServlet> druidServlet() {
        ServletRegistrationBean<StatViewServlet> bean = 
            new ServletRegistrationBean<>(new StatViewServlet(), "/druid/*");
        
        Map<String, String> initParams = new HashMap<>();
        initParams.put("loginUsername", "admin");
        initParams.put("loginPassword", "admin123");
        initParams.put("allow", "");  // empty means everyone is allowed by default
        initParams.put("deny", "192.168.1.100");  // deny one specific IP
        
        bean.setInitParameters(initParams);
        return bean;
    }
}

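4. Custom monitoring dashboard

java
// Illustrative sketch: the UiExtension interface implemented here is assumed for this example.
// In SBA 2.x, title, brand, favicon and login icon are set through spring.boot.admin.ui.* properties.
@Component
public class CustomUiExtension implements UiExtension {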
    
    @Override
    public Map<String, ?> getAssets() {
        Map<String, String> assets = new HashMap<>();
        // Custom CSS
        assets.put("custom.css", "/assets/css/custom.css");
        // Custom JS
        assets.put("custom.js", "/assets/js/custom.js");
        return assets;
    }
    
    @Override
    public String getBrand() {
        return "<span class=\"navbar-brand\">🚀 My Monitoring Platform</span>";
    }
    
    @Override
    public String getTitle() {
        return "Microservice Monitoring Center";
    }
    
    @Override
    public String getFavicon() {
        return "assets/img/favicon.ico";
    }
    
    @Override
    public Map<String, String> getLoginIcon() {
        Map<String, String> icons = new HashMap<>();
        icons.put("img", "assets/img/login-logo.png");
        icons.put("favicon", "assets/img/favicon.ico");
        return icons;
    }
    
    @Override
    public Map<String, ?> getViews() {
        Map<String, String> views = new HashMap<>();
        // Custom view
        views.put("custom", "/custom/view");
        return views;
    }
}

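VI. Alert notification configuration

1. Mail alerts

yaml
spring:
  boot:
    admin:
      notify:
        mail:
          enabled: true
          # Recipients (comma-separated)
          to: admin@company.com,dev-team@company.com
          # CC
          cc: manager@company.com
          # BCC
          bcc: 
          # Sender
          from: monitor@company.com
          # Template. In SBA 2.x this property is the location of a Thymeleaf template
          # file rather than inline text; the path below is a hypothetical example.
          template: classpath:/templates/status-changed.html
          # The template would render something like:
          #   Service status change notification
          #
          #   Service name: instance.registration.name
          #   Service ID:   instance.id
          #   Service URL:  instance.registration.serviceUrl
          #   Status:       event.statusInfo.status
          #   Time:         event.timestamp
          #   Details:      event.statusInfo.details
          #
          #   Please handle it promptly!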

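2. DingTalk alerts

java
import de.codecentric.boot.admin.server.domain.entities.InstanceRepository;
import de.codecentric.boot.admin.server.notify.DingTalkNotifier;
import de.codecentric.boot.admin.server.notify.Notifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;

@Configuration
public class DingTalkNotifierConfig {
    
    // In SBA 2.x the notifier takes the instance repository and a RestTemplate in its constructor
    @Bean
    public Notifier dingTalkNotifier(InstanceRepository repository) {
        DingTalkNotifier notifier = new DingTalkNotifier(repository, new RestTemplate());
        notifier.setWebhookUrl("https://oapi.dingtalk.com/robot/send?access_token=xxx");
        notifier.setSecret("xxx");  // signing secret from the robot's security settings
        notifier.setEnabled(true);
        
        // Custom message: a SpEL template evaluated against event, instance and lastStatus.
        // #dates.format(...) is a Thymeleaf helper and baseUrl is not in that context,
        // so the raw event timestamp and the instance's service URL are used instead.
        notifier.setMessage(
            "[Service alert]\n" +
            "Service: #{instance.registration.name}\n" +
            "Status: #{event.statusInfo.status}\n" +
            "Time: #{event.timestamp}\n" +
            "Details: #{event.statusInfo.details}\n" +
            "👉 #{instance.registration.serviceUrl}"
        );
        
        // Only notify on DOWN: ignore every transition into any other status
        notifier.setIgnoreChanges(new String[] {
            "*:UP", "*:OFFLINE", "*:RESTRICTED", "*:OUT_OF_SERVICE", "*:UNKNOWN"
        });
        
        return notifier;
    }
}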
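
When a secret is set on the robot, DingTalk expects every webhook call to carry a timestamp and a signature. The built-in notifier is expected to handle this for you, but the step matters if you call the webhook yourself, for example from a custom notifier. A minimal plain-Java sketch of the signing scheme as I understand it from DingTalk's robot documentation (HMAC-SHA256 over timestamp + "\n" + secret, Base64, then URL-encoded). The class and method names are illustrative and not part of Spring Boot Admin:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative sketch of signing a DingTalk robot webhook call. */
final class DingTalkSignSketch {

    /** Computes the URL-encoded signature for the given timestamp and secret. */
    static String sign(long timestampMillis, String secret) {
        try {
            String stringToSign = timestampMillis + "\n" + secret;
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] digest = mac.doFinal(stringToSign.getBytes(StandardCharsets.UTF_8));
            return URLEncoder.encode(Base64.getEncoder().encodeToString(digest), "UTF-8");
        } catch (Exception e) {
            throw new IllegalStateException("Could not sign the DingTalk request", e);
        }
    }

    /** Appends timestamp and sign to a webhook URL that already has a query string. */
    static String signedUrl(String webhookUrl, long timestampMillis, String secret) {
        return webhookUrl + "&timestamp=" + timestampMillis
            + "&sign=" + sign(timestampMillis, secret);
    }
}
```

In real use the timestamp would be System.currentTimeMillis() at send time, since the server is expected to reject signatures that are too old.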

3. WeChat Work alerts

java
// Assumption: WeChatNotifier is a custom notifier class of your own (SBA 2.x does not
// appear to ship one); it is expected to expose the setters used below.
@Bean
public Notifier wechatNotifier() {
    WeChatNotifier notifier = new WeChatNotifier();
    notifier.setWebhookUrl("https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxx");
    notifier.setEnabled(true);
    
    // Markdown format
    notifier.setMessage(
        "### Service monitoring alert\n" +
        "> **Service name**: #{instance.registration.name}\n" +
        "> **Service status**: <font color=\"warning\">#{event.statusInfo.status}</font>\n" +
        "> **Time**: #{#dates.format(timestamp, 'yyyy-MM-dd HH:mm:ss')}\n" +
        "> **Details**: #{event.statusInfo.details}\n" +
        "#{baseUrl}"
    );
    
    return notifier;
}

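4. Custom alert rules

java
import de.codecentric.boot.admin.server.domain.entities.Instance;
import de.codecentric.boot.admin.server.domain.entities.InstanceRepository;
import de.codecentric.boot.admin.server.domain.events.InstanceEvent;
import de.codecentric.boot.admin.server.domain.events.InstanceStatusChangedEvent;
import de.codecentric.boot.admin.server.notify.AbstractStatusChangeNotifier;
import org.springframework.stereotype.Component;
import reactor.core.publisher.Mono;

@Component
public class CustomNotifier extends AbstractStatusChangeNotifier {
    
    public CustomNotifier(InstanceRepository repository) {
        super(repository);  // the base notifier needs the repository to load instances
    }
    
    @Override
    protected Mono<Void> doNotify(InstanceEvent event, Instance instance) {
        return Mono.fromRunnable(() -> {
            // Only handle status change events
            if (event instanceof InstanceStatusChangedEvent) {
                InstanceStatusChangedEvent statusChangeEvent = 
                    (InstanceStatusChangedEvent) event;
                
                // The event carries only the new status; the previous one is tracked by the base class
                String from = getLastStatus(event.getInstance());
                String to = statusChangeEvent.getStatusInfo().getStatus();
                
                // Custom alert logic
                if ("DOWN".equals(to)) {
                    // Service is down: emergency alert
                    sendEmergencyAlert(instance, statusChangeEvent);
                } else if ("UP".equals(from) && "OFFLINE".equals(to)) {
                    // Service went offline: normal alert
                    sendNormalAlert(instance, statusChangeEvent);
                } else if ("OFFLINE".equals(from) && "UP".equals(to)) {
                    // Service is back: recovery notice
                    sendRecoveryAlert(instance, statusChangeEvent);
                }
            }
        });
    }
    
    private void sendEmergencyAlert(Instance instance, InstanceStatusChangedEvent event) {
        // Send urgent notifications such as SMS or a phone call
        System.out.println("🚨 Emergency alert! Service is down: " + 
                          instance.getRegistration().getName());
        
        // Call the phone alert API
        callPhoneAlert(instance, event);
        
        // Call the SMS API
        sendSmsAlert(instance, event);
    }
    
    private void sendNormalAlert(Instance instance, InstanceStatusChangedEvent event) {
        // Send ordinary notifications such as mail or DingTalk
        System.out.println("⚠️ Service problem: " + 
                          instance.getRegistration().getName());
    }
    
    private void sendRecoveryAlert(Instance instance, InstanceStatusChangedEvent event) {
        // Send the recovery notice
        System.out.println("✅ Service recovered: " + 
                          instance.getRegistration().getName());
    }
    
    // Placeholders: connect these to your own phone and SMS gateways
    private void callPhoneAlert(Instance instance, InstanceStatusChangedEvent event) { }
    
    private void sendSmsAlert(Instance instance, InstanceStatusChangedEvent event) { }
}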
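
The routing rules in the notifier are easier to test when they live in a pure function that needs no Spring context. A minimal sketch of the same three rules in plain Java; the class AlertRoutingSketch and the Severity enum are illustrative names, not part of Spring Boot Admin:

```java
/** Illustrative sketch: maps a status transition to an alert severity. */
final class AlertRoutingSketch {

    enum Severity { EMERGENCY, NORMAL, RECOVERY, NONE }

    /** Applies the rules from the notifier: any move to DOWN is an emergency. */
    static Severity classify(String from, String to) {
        if ("DOWN".equals(to)) {
            return Severity.EMERGENCY;
        }
        if ("UP".equals(from) && "OFFLINE".equals(to)) {
            return Severity.NORMAL;
        }
        if ("OFFLINE".equals(from) && "UP".equals(to)) {
            return Severity.RECOVERY;
        }
        return Severity.NONE;
    }
}
```

Writing the rules this way also exposes a gap: a DOWN to UP transition falls through to NONE, so no recovery notice is sent after an emergency alert unless a rule for it is added.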

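VII. Production deployment options

Option 1: Single-node deployment (small projects)

yaml
# application-prod.yml
spring:
  boot:
    admin:
      # Serve the Admin UI and API under /admin
      context-path: /admin
      
      # Notification settings
      notify:
        mail:
          enabled: true
          to: ${ALERT_EMAILS}
          from: ${MAIL_FROM}
        
        dingtalk:
          enabled: true
          webhook-url: ${DINGTALK_WEBHOOK}
          secret: ${DINGTALK_SECRET}
          
        # Reminders for instances that stay in the listed statuses
        reminder:
          period: 30s  # remind every 30 seconds
          statuses: DOWN
      
      # Instance retention (verify this key for your version)
      instance:
        retention-period: 7d  # keep for 7 days
        
      # Metadata keys whose values are masked
      metadata-keys-to-sanitize: .*password.*, .*secret.*, .*key.*
      
      # Ignored endpoints with sensitive information (verify this key for your version)
      ignored-routes: /env/**, /heapdump/**, /logfile/**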
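
The metadata-keys-to-sanitize setting takes regular expressions that are matched against metadata keys. A minimal plain-Java sketch of that idea; the mask string, the case-insensitive matching, and the class name are assumptions made for illustration, not SBA's implementation:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Illustrative sketch: masks metadata values whose keys match any pattern. */
final class MetadataSanitizerSketch {

    static final String MASK = "******";  // assumed mask value

    static Map<String, String> sanitize(Map<String, String> metadata, List<String> keyPatterns) {
        List<Pattern> patterns = new ArrayList<>();
        for (String keyPattern : keyPatterns) {
            // trim() tolerates the spaces left by a comma-separated property value
            patterns.add(Pattern.compile(keyPattern.trim(), Pattern.CASE_INSENSITIVE));
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            boolean sensitive = false;
            for (Pattern pattern : patterns) {
                if (pattern.matcher(entry.getKey()).matches()) {
                    sensitive = true;
                    break;
                }
            }
            result.put(entry.getKey(), sensitive ? MASK : entry.getValue());
        }
        return result;
    }
}
```

Note how broad .*key.* is: it also masks harmless keys such as "monkey-count", so narrower patterns are usually worth the effort.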

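Option 2: Clustered deployment (high availability)

yaml
# Several Admin Server instances.
# The clustering mechanism documented for SBA 2.x is Hazelcast replication of the event store;
# treat the store and redis keys below as illustrative and verify them for your version.
spring:
  boot:
    admin:
      # Keep instance information in a shared store
      store:
        type: redis  # or jdbc
        
      # Redis settings
      redis:
        host: ${REDIS_HOST}
        port: ${REDIS_PORT}
        password: ${REDIS_PASSWORD}
        
      # Instance synchronization
      instance:
        # Use service discovery
        discovery:
          enabled: true
          services: user-service,order-service,product-service
        
        # Service URL override
        service-url:
          override: http://${spring.cloud.client.ip-address}:${server.port}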

Option 3: Combine with Prometheus + Grafana

yaml
# Enable Prometheus metrics
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
    
  metrics:
    export:
      prometheus:
        enabled: true
    
    # Tags
    tags:
      application: ${spring.application.name}
      environment: ${spring.profiles.active}
      
    # Distribution statistics
    distribution:
      percentiles-histogram:
        http.server.requests: true
      slo:
        http.server.requests: 100ms, 200ms, 500ms

Grafana dashboard configuration:

json
{
  "dashboard": {
    "title": "Microservice monitoring",
    "panels": [
      {
        "title": "CPU usage",
        "targets": [
          {
            "expr": "system_cpu_usage{application=\"$application\"}",
            "legendFormat": "{{instance}}"
          }
        ]
      },
      {
        "title": "Memory usage",
        "targets": [
          {
            "expr": "jvm_memory_used_bytes{application=\"$application\", area=\"heap\"}",
            "legendFormat": "Heap memory"
          }
        ]
      }
    ]
  }
}

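VIII. Monitoring best practices

1. Tiered monitoring strategy

yaml
# Configure a different monitoring level per environment
spring:
  config:
    activate:
      on-profile: prod  # replaces the "spring.profiles" document key, which Spring Boot 2.4+ rejects
  
  boot:
    admin:
      notify:
        # Production: urgent alerts
        mail:
          enabled: true
          to: oncall-team@company.com
          # Only notify on DOWN (verify this key; the documented filter is ignore-changes)
          statuses: DOWN
          # No repeated alert within 5 minutes (verify this key for your version)
          throttle-period: 5m
        
        # Test environment: notify on every change
        slack:
          enabled: true
          webhook-url: ${SLACK_WEBHOOK}
          # Notify on all status changes
          statuses: UP, DOWN, OFFLINE, UNKNOWN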
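2. Key metrics monitoring

java
@Component
public class CriticalMetricsMonitor {
    
    private final MeterRegistry meterRegistry;
    
    public CriticalMetricsMonitor(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
    }
    
    // Key metrics and thresholds. Ratio thresholds only make sense for values already
    // in the 0..1 range (such as system.cpu.usage); raw values such as jvm.memory.used
    // (bytes) have to be normalized before they are compared.
    private static final Map<String, Double> THRESHOLDS = Map.of(
        "system.cpu.usage", 0.8,          // CPU usage 80%
        "jvm.memory.used", 0.9,           // memory usage 90%
        "tomcat.threads.busy", 0.8,       // thread usage 80%
        "http.server.requests.duration", 1.0,  // API response time 1 second
        "hikaricp.connections.active", 0.9,    // database connections 90%
        "redis.lettuce.command.duration", 0.1  // Redis command 100 ms
    );
    
    @Scheduled(fixedDelay = 30000)  // check every 30 seconds (requires @EnableScheduling)
    public void checkMetrics() {
        THRESHOLDS.forEach((metric, threshold) -> {
            Double value = getMetricValue(metric);
            if (value != null && value > threshold) {
                sendAlert(metric, value, threshold);
            }
        });
    }
    
    // Reads a gauge by name; returns null when no gauge with that name is registered
    private Double getMetricValue(String metric) {
        Gauge gauge = meterRegistry.find(metric).gauge();
        return gauge == null ? null : gauge.value();
    }
    
    // Placeholder: connect a real alert channel here
    private void sendAlert(String metric, double value, double threshold) {
        System.out.printf("Metric %s = %.3f exceeds threshold %.3f%n", metric, value, threshold);
    }
}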
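
A ratio threshold such as 0.9 for memory only works if the value being compared is itself a ratio. A minimal sketch, assuming plain JDK management beans rather than Micrometer, that turns heap usage into a used/max ratio before comparing it with a threshold; the class and method names are illustrative:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Illustrative sketch: normalizes heap usage to a ratio before threshold checks. */
final class HeapUsageSketch {

    /** Returns used/max, or NaN when the maximum is undefined (reported as -1). */
    static double ratio(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return Double.NaN;
        }
        return (double) usedBytes / maxBytes;
    }

    /** Reads the current heap usage of this JVM as a ratio. */
    static double currentHeapRatio() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return ratio(heap.getUsed(), heap.getMax());
    }

    /** True only for a defined value that is strictly above the threshold. */
    static boolean exceeds(double value, double threshold) {
        return !Double.isNaN(value) && value > threshold;
    }
}
```

A check like HeapUsageSketch.exceeds(HeapUsageSketch.currentHeapRatio(), 0.9) then means "heap is more than 90% full", which a comparison of raw bytes against 0.9 would not.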

3. Capacity planning

java
@Service
public class CapacityPlanningService {
    
    @Autowired
    private MeterRegistry meterRegistry;
    
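    /**
     * Predict capacity from historical data.
     * getHistoricalMetrics, calculateGrowthRate and predictValue are helper methods not shown here.
     */
    public CapacityPrediction predictCapacity(String serviceName) {
        // Fetch the historical metrics
        Map<String, List<Double>> history = 
            getHistoricalMetrics(serviceName, 30);  // the last 30 days
        
        // Analyze the trend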
        double cpuGrowthRate = calculateGrowthRate(history.get("cpu"));
        double memoryGrowthRate = calculateGrowthRate(history.get("memory"));
        double requestGrowthRate = calculateGrowthRate(history.get("requests"));
        
        // Predict the demand three months from now
        LocalDate predictionDate = LocalDate.now().plusMonths(3);
        double predictedCpu = predictValue(cpuGrowthRate, predictionDate);
        double predictedMemory = predictValue(memoryGrowthRate, predictionDate);
        double predictedRequests = predictValue(requestGrowthRate, predictionDate);
        
        // Work out the resources needed
        int neededInstances = (int) Math.ceil(predictedRequests / 1000);  // assumes one instance handles 1000 QPS
        double neededCpuPerInstance = predictedCpu / neededInstances;
        double neededMemoryPerInstance = predictedMemory / neededInstances;
        
        return new CapacityPrediction(
            neededInstances,
            neededCpuPerInstance,
            neededMemoryPerInstance,
            predictionDate
        );
    }
}
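
The service above leaves the trend calculation to helper methods. A minimal sketch of one possible approach, assuming simple linear growth between the first and last daily sample; this is a stand-in chosen for illustration, not necessarily what the helpers in the original project do, and the class and method names are hypothetical:

```java
/** Illustrative sketch: linear extrapolation of a daily metric. */
final class CapacitySketch {

    /** Average change per day between the first and the last sample. */
    static double averageDailyGrowth(double[] daily) {
        if (daily.length < 2) {
            return 0.0;
        }
        return (daily[daily.length - 1] - daily[0]) / (daily.length - 1);
    }

    /** Projects the last sample forward by daysAhead days of average growth. */
    static double predict(double[] daily, int daysAhead) {
        if (daily.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        return daily[daily.length - 1] + averageDailyGrowth(daily) * daysAhead;
    }

    /** Number of instances needed when one instance handles qpsPerInstance. */
    static int instancesNeeded(double predictedQps, double qpsPerInstance) {
        return (int) Math.ceil(predictedQps / qpsPerInstance);
    }
}
```

For example, daily peaks of 1000, 1100, 1200 and 1300 QPS grow by 100 per day, so 90 days ahead the projection is 10300 QPS, or 11 instances at 1000 QPS each. Real traffic is rarely this linear, so treat the result as a rough lower bound.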

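IX. Troubleshooting common problems

1. The Admin Server cannot find the client

Check:

  1. The client's spring.boot.admin.client.url is set correctly
  2. The client exposes the /actuator endpoints
  3. Network connectivity
  4. Security configuration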
2. Health check shows UNKNOWN

yaml
# Client configuration
management:
  endpoint:
    health:
      show-details: always
      # Add a health group that includes the custom health indicator
      group:
        custom:
          include: diskSpace, db, redis, custom

3. Inaccurate monitoring data

java
// In a custom HealthIndicator, make sure exceptions are handled properly
@Override
protected void doHealthCheck(Health.Builder builder) throws Exception {
    try {
        // Check logic
        boolean healthy = checkService();
        if (healthy) {
            builder.up();
        } else {
            builder.down();
        }
    } catch (Exception e) {
        builder.down(e);  // record the exception
    }
}

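4. Performance impact

yaml
# Tune the monitoring frequency
management:
  endpoints:
    web:
      exposure:
        include: health, info, metrics  # expose only what is needed
  
  metrics:
    export:
      # Lower the collection frequency
      simple:
        step: 30s  # one step every 30 seconds
    
    # Turn off metrics that do not matter
    enable:
      jvm: true
      system: true
      tomcat: false  # if you are not on Tomcat
      logback: false
      hikaricp: true  # the database connection pool matters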

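X. Today's summary

What did we learn?

  1. ✅ What Spring Boot Admin is for (the microservices' "personal bodyguard")
  2. ✅ Setting up and configuring the Admin Server
  3. ✅ Client integration and exposing monitoring endpoints
  4. ✅ Custom health checks
  5. ✅ Metrics monitoring and custom metrics
  6. ✅ Alert notification configuration (mail, DingTalk, WeChat)
  7. ✅ Production deployment options
  8. ✅ Monitoring best practices

Key points

  1. Actuator is required: clients must expose their endpoints
  2. Security first: the Admin Server must have access control
  3. Tiered monitoring: a different strategy for each environment
  4. Timely alerts: several notification channels
  5. Performance balance: monitoring must not hurt the business

XI. What's next?

Tomorrow we learn containerized deployment

  • Package microservices with Docker: build once, run anywhere
  • Docker Compose to start every service with one command
  • Image optimization: the secret to slimming from 1GB down to 100MB
  • Docker best practices for production

Tomorrow the microservices move into "shipping containers", and deployment stops being a headache! 🐳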


