Part 9: Monitoring and Operations - Integrating Actuator Health Checks

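Introduction

Monitoring and operations are indispensable in production. This chapter integrates Spring Boot Actuator to add health checks, metrics, and runtime management to the logging framework, giving it production-grade observability.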

Actuator Integration Architecture

graph LR
    A[Actuator endpoints] --> B[LogEndpoint]
    A --> C[LogHealthIndicator]
    B --> D[Log statistics]
    C --> E[Health status checks]
    E --> F[Metrics exposure]

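Integration points:

  • 🏥 Health checks: monitor the framework's runtime status
  • 📊 Metrics collection: count log calls and record performance data
  • ⚙️ Runtime management: operate on the running framework, such as clearing caches and resetting statistics
  • 🔍 Troubleshooting: expose information for debugging and diagnosis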

LogHealthIndicator - Health Indicator

java
package com.simpleflow.log.springboot.actuator;

import com.simpleflow.log.context.ThreadLocalTraceHolder;
import com.simpleflow.log.processor.AnnotationConfigResolver;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

/**
 * Health indicator for the logging framework.
 *
 * Reports the framework's runtime status together with JVM memory and thread usage.
 */
@Component
public class LogHealthIndicator implements HealthIndicator {
    
    private final AnnotationConfigResolver configResolver;
    private volatile long lastCheckTime = System.currentTimeMillis();
    private volatile boolean lastCheckResult = true;
    
    public LogHealthIndicator(AnnotationConfigResolver configResolver) {
        this.configResolver = configResolver;
    }
    
    @Override
    public Health health() {
        try {
            Health.Builder builder = new Health.Builder();
            
            // Check the core components
            boolean coreHealthy = checkCoreComponents();
            
            // Check memory usage
            MemoryStatus memoryStatus = checkMemoryUsage();
            
            // Check thread status
            ThreadStatus threadStatus = checkThreadStatus();
            
            // Check the configuration cache
            CacheStatus cacheStatus = checkCacheStatus();
            
            // Combine the individual results into the overall status
            boolean isHealthy = coreHealthy && 
                              memoryStatus.isHealthy() && 
                              threadStatus.isHealthy() &&
                              cacheStatus.isHealthy();
            
            if (isHealthy) {
                builder.up();
            } else {
                builder.down();
            }
            
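            // Add details. The status objects are rendered with toString() so the
            // JSON output contains summary strings rather than nested objects.
            builder.withDetail("core", coreHealthy ? "UP" : "DOWN")
                   .withDetail("memory", memoryStatus.toString())
                   .withDetail("threads", threadStatus.toString())
                   .withDetail("cache", cacheStatus.toString())
                   .withDetail("lastCheckTime", lastCheckTime)
                   .withDetail("uptime", getUptimeInfo());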
            
            lastCheckTime = System.currentTimeMillis();
            lastCheckResult = isHealthy;
            
            return builder.build();
            
        } catch (Exception e) {
            return Health.down()
                        .withDetail("error", e.getMessage())
                        .withDetail("lastCheckTime", lastCheckTime)
                        .build();
        }
    }
    
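    /**
     * Checks the core components.
     */
    private boolean checkCoreComponents() {
        try {
            // Verify that the configuration resolver is available
            if (configResolver == null) {
                return false;
            }
            
            // Verify that the ThreadLocal trace context works.
            // Note: this initializes and then clears the trace on the calling thread.
            ThreadLocalTraceHolder.initTrace();
            boolean hasContext = ThreadLocalTraceHolder.getCurrentTrace() != null;
            ThreadLocalTraceHolder.clearCurrentTrace();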
            
            return hasContext;
            
        } catch (Exception e) {
            return false;
        }
    }
    
    /**
     * Checks heap memory usage.
     */
    private MemoryStatus checkMemoryUsage() {
        MemoryMXBean memoryBean = ManagementFactory.getMemoryMXBean();
        
        long usedMemory = memoryBean.getHeapMemoryUsage().getUsed();
        long maxMemory = memoryBean.getHeapMemoryUsage().getMax();
        
        double usagePercentage = maxMemory > 0 ? (double) usedMemory / maxMemory * 100 : 0;
        
        return new MemoryStatus(usedMemory, maxMemory, usagePercentage);
    }
    
    /**
     * Checks thread status.
     */
    private ThreadStatus checkThreadStatus() {
        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        
        int threadCount = threadBean.getThreadCount();
        int daemonThreadCount = threadBean.getDaemonThreadCount();
        long totalStartedThreadCount = threadBean.getTotalStartedThreadCount();
        
        return new ThreadStatus(threadCount, daemonThreadCount, totalStartedThreadCount);
    }
    
    /**
     * Checks the configuration cache status.
     */
    private CacheStatus checkCacheStatus() {
        try {
            String cacheStats = configResolver.getCacheStats();
            return new CacheStatus(true, cacheStats);
        } catch (Exception e) {
            return new CacheStatus(false, "缓存状态检查失败: " + e.getMessage());
        }
    }
    
    /**
     * Returns the JVM uptime as a formatted string.
     */
    private String getUptimeInfo() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long hours = uptime / (1000 * 60 * 60);
        long minutes = (uptime % (1000 * 60 * 60)) / (1000 * 60);
        long seconds = (uptime % (1000 * 60)) / 1000;
        
        return String.format("%d小时%d分钟%d秒", hours, minutes, seconds);
    }
    
    // ========== Internal status classes ==========
    
    public static class MemoryStatus {
        private final long usedMemory;
        private final long maxMemory;
        private final double usagePercentage;
        
        public MemoryStatus(long usedMemory, long maxMemory, double usagePercentage) {
            this.usedMemory = usedMemory;
            this.maxMemory = maxMemory;
            this.usagePercentage = usagePercentage;
        }
        
        public boolean isHealthy() {
            return usagePercentage < 90.0; // healthy while heap usage is below 90%
        }
        
        public long getUsedMemory() { return usedMemory; }
        public long getMaxMemory() { return maxMemory; }
        public double getUsagePercentage() { return usagePercentage; }
        
        @Override
        public String toString() {
            return String.format("已使用: %dMB, 最大: %dMB, 使用率: %.2f%%", 
                usedMemory / 1024 / 1024, 
                maxMemory / 1024 / 1024, 
                usagePercentage);
        }
    }
    
    public static class ThreadStatus {
        private final int threadCount;
        private final int daemonThreadCount;
        private final long totalStartedThreadCount;
        
        public ThreadStatus(int threadCount, int daemonThreadCount, long totalStartedThreadCount) {
            this.threadCount = threadCount;
            this.daemonThreadCount = daemonThreadCount;
            this.totalStartedThreadCount = totalStartedThreadCount;
        }
        
        public boolean isHealthy() {
            return threadCount < 1000; // healthy while there are fewer than 1000 live threads
        }
        
        public int getThreadCount() { return threadCount; }
        public int getDaemonThreadCount() { return daemonThreadCount; }
        public long getTotalStartedThreadCount() { return totalStartedThreadCount; }
        
        @Override
        public String toString() {
            return String.format("当前线程: %d, 守护线程: %d, 总启动线程: %d", 
                threadCount, daemonThreadCount, totalStartedThreadCount);
        }
    }
    
    public static class CacheStatus {
        private final boolean healthy;
        private final String stats;
        
        public CacheStatus(boolean healthy, String stats) {
            this.healthy = healthy;
            this.stats = stats;
        }
        
        public boolean isHealthy() { return healthy; }
        public String getStats() { return stats; }
        
        @Override
        public String toString() {
            return stats;
        }
    }
}
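
The memory check above marks the framework unhealthy once heap usage reaches 90%. As a minimal sketch under that same threshold, the calculation can be pulled into a small pure helper so it is testable without a Spring context. The class and method names (`HeapCheck`, `usagePercent`, `isHealthy`) are hypothetical and not part of the framework. One deliberate difference from the code above: when `MemoryUsage.getMax()` is undefined (it returns -1), this sketch falls back to the committed size instead of reporting 0%.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of the framework: isolates the threshold logic
// so it can be unit-tested without a running Spring context.
final class HeapCheck {

    // Same 90% threshold as MemoryStatus.isHealthy() above.
    static final double THRESHOLD_PERCENT = 90.0;

    private HeapCheck() {
    }

    // Returns heap usage as a percentage of the limit. MemoryUsage.getMax() is -1
    // when the maximum is undefined, so fall back to the committed size (assumption).
    static double usagePercent(long used, long max, long committed) {
        long limit = max > 0 ? max : committed;
        return limit > 0 ? (double) used / limit * 100 : 0.0;
    }

    static boolean isHealthy(double usagePercent) {
        return usagePercent < THRESHOLD_PERCENT;
    }

    // Reads the current JVM heap figures and applies the check.
    static boolean currentHeapIsHealthy() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return isHealthy(usagePercent(heap.getUsed(), heap.getMax(), heap.getCommitted()));
    }
}
```

Because heap usage is a JVM-wide figure, a check like this says more about the application as a whole than about the logging framework itself, which is worth keeping in mind when deciding whether it should be able to turn the indicator DOWN.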

LogEndpoint - Custom Endpoint

java
package com.simpleflow.log.springboot.actuator;

import com.simpleflow.log.config.LogConfig;
import com.simpleflow.log.processor.AnnotationConfigResolver;
import com.simpleflow.log.springboot.properties.LogProperties;
import org.springframework.boot.actuate.endpoint.annotation.Endpoint;
import org.springframework.boot.actuate.endpoint.annotation.ReadOperation;
import org.springframework.boot.actuate.endpoint.annotation.WriteOperation;
import org.springframework.stereotype.Component;

import java.time.LocalDateTime;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

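/**
 * Custom Actuator endpoint for the logging framework.
 *
 * Exposes log statistics, the effective configuration, and runtime management operations.
 */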
@Component
@Endpoint(id = "simpleflow-log")
public class LogEndpoint {
    
    private final LogProperties logProperties;
    private final AnnotationConfigResolver configResolver;
    private final LogConfig defaultLogConfig;
    
    // Statistics
    private final AtomicLong totalLogCount = new AtomicLong(0);
    private final AtomicLong errorLogCount = new AtomicLong(0);
    private final AtomicLong cacheHitCount = new AtomicLong(0);
    private final AtomicLong cacheMissCount = new AtomicLong(0);
    
    private volatile LocalDateTime startTime = LocalDateTime.now();
    
    public LogEndpoint(LogProperties logProperties, 
                      AnnotationConfigResolver configResolver,
                      LogConfig defaultLogConfig) {
        this.logProperties = logProperties;
        this.configResolver = configResolver;
        this.defaultLogConfig = defaultLogConfig;
    }
    
    /**
     * Returns the framework's status information.
     */
    @ReadOperation
    public Map<String, Object> info() {
        Map<String, Object> info = new HashMap<>();
        
        // Basic information
        info.put("framework", "SimpleFlow Log Framework");
        info.put("version", "1.0.0");
        info.put("startTime", startTime);
        info.put("status", "RUNNING");
        
        // Configuration
        info.put("configuration", getConfigurationInfo());
        
        // Statistics
        info.put("statistics", getStatisticsInfo());
        
        // Performance
        info.put("performance", getPerformanceInfo());
        
        // Cache
        info.put("cache", getCacheInfo());
        
        return info;
    }
    
    /**
     * Returns one section selected by path segment, for example
     * /actuator/simpleflow-log/config or /actuator/simpleflow-log/stats.
     * An endpoint may declare only one operation per HTTP method and path, so the
     * sections are dispatched through a selector instead of several no-argument
     * @ReadOperation methods. Returning null from a read operation yields HTTP 404.
     */
    @ReadOperation
    public Map<String, Object> section(
            @org.springframework.boot.actuate.endpoint.annotation.Selector String name) {
        if ("config".equals(name)) {
            return config();
        }
        if ("stats".equals(name)) {
            return stats();
        }
        return null;
    }
    
    /**
     * Returns the configuration.
     */
    public Map<String, Object> config() {
        Map<String, Object> config = new HashMap<>();
        
        config.put("enabled", logProperties.isEnabled());
        config.put("defaultLevel", logProperties.getDefaultLevel());
        config.put("webEnabled", logProperties.isWebEnabled());
        config.put("logArgs", logProperties.isLogArgs());
        config.put("logResult", logProperties.isLogResult());
        config.put("logExecutionTime", logProperties.isLogExecutionTime());
        config.put("globalSensitiveFields", logProperties.getGlobalSensitiveFields());
        
        // Request log configuration
        Map<String, Object> requestLogConfig = new HashMap<>();
        LogProperties.RequestLog requestLog = logProperties.getRequestLog();
        requestLogConfig.put("enabled", requestLog.isEnabled());
        requestLogConfig.put("logHeaders", requestLog.isLogHeaders());
        requestLogConfig.put("logParameters", requestLog.isLogParameters());
        requestLogConfig.put("excludePatterns", requestLog.getExcludePatterns());
        config.put("requestLog", requestLogConfig);
        
        // Performance configuration
        Map<String, Object> performanceConfig = new HashMap<>();
        LogProperties.Performance performance = logProperties.getPerformance();
        performanceConfig.put("asyncEnabled", performance.isAsyncEnabled());
        performanceConfig.put("asyncQueueSize", performance.getAsyncQueueSize());
        performanceConfig.put("cacheSize", performance.getCacheSize());
        performanceConfig.put("maxLogLength", performance.getMaxLogLength());
        config.put("performance", performanceConfig);
        
        return config;
    }
    
    /**
     * Returns the statistics.
     */
    public Map<String, Object> stats() {
        return getStatisticsInfo();
    }
    
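    /**
     * Runs a management action selected by path segment, for example
     * POST /actuator/simpleflow-log/clearCache. Both actions share one
     * @WriteOperation because an endpoint may declare only one operation
     * per HTTP method and path.
     */
    @WriteOperation
    public Map<String, Object> execute(
            @org.springframework.boot.actuate.endpoint.annotation.Selector String action) {
        if ("clearCache".equals(action)) {
            return clearCache();
        }
        if ("resetStats".equals(action)) {
            return resetStats();
        }
        Map<String, Object> result = new HashMap<>();
        result.put("success", false);
        result.put("message", "Unknown action: " + action);
        result.put("timestamp", LocalDateTime.now());
        return result;
    }
    
    /**
     * Clears the cache.
     */
    public Map<String, Object> clearCache() {
        try {
            configResolver.clearAllCache();
            
            Map<String, Object> result = new HashMap<>();
            result.put("success", true);
            result.put("message", "缓存清除成功");
            result.put("timestamp", LocalDateTime.now());
            
            return result;
            
        } catch (Exception e) {
            Map<String, Object> result = new HashMap<>();
            result.put("success", false);
            result.put("message", "缓存清除失败: " + e.getMessage());
            result.put("timestamp", LocalDateTime.now());
            
            return result;
        }
    }
    
    /**
     * Resets the statistics.
     */
    public Map<String, Object> resetStats() {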
        totalLogCount.set(0);
        errorLogCount.set(0);
        cacheHitCount.set(0);
        cacheMissCount.set(0);
        startTime = LocalDateTime.now();
        
        Map<String, Object> result = new HashMap<>();
        result.put("success", true);
        result.put("message", "统计信息重置成功");
        result.put("timestamp", LocalDateTime.now());
        
        return result;
    }
    
    // ========== Private methods ==========
    
    private Map<String, Object> getConfigurationInfo() {
        Map<String, Object> config = new HashMap<>();
        config.put("enabled", logProperties.isEnabled());
        config.put("defaultLevel", logProperties.getDefaultLevel());
        config.put("webEnabled", logProperties.isWebEnabled());
        config.put("actuatorEnabled", logProperties.isActuatorEnabled());
        return config;
    }
    
    private Map<String, Object> getStatisticsInfo() {
        Map<String, Object> stats = new HashMap<>();
        stats.put("totalLogCount", totalLogCount.get());
        stats.put("errorLogCount", errorLogCount.get());
        stats.put("successRate", calculateSuccessRate());
        stats.put("startTime", startTime);
        stats.put("uptime", calculateUptime());
        return stats;
    }
    
    private Map<String, Object> getPerformanceInfo() {
        Map<String, Object> performance = new HashMap<>();
        
        // JVM memory figures
        Runtime runtime = Runtime.getRuntime();
        performance.put("totalMemory", runtime.totalMemory());
        performance.put("freeMemory", runtime.freeMemory());
        performance.put("maxMemory", runtime.maxMemory());
        performance.put("usedMemory", runtime.totalMemory() - runtime.freeMemory());
        
        // Memory usage as a percentage of the maximum heap
        double memoryUsage = (double) (runtime.totalMemory() - runtime.freeMemory()) / runtime.maxMemory() * 100;
        performance.put("memoryUsagePercentage", String.format("%.2f%%", memoryUsage));
        
        return performance;
    }
    
    private Map<String, Object> getCacheInfo() {
        Map<String, Object> cache = new HashMap<>();
        cache.put("stats", configResolver.getCacheStats());
        cache.put("hitCount", cacheHitCount.get());
        cache.put("missCount", cacheMissCount.get());
        cache.put("hitRate", calculateHitRate());
        return cache;
    }
    
    private String calculateSuccessRate() {
        long total = totalLogCount.get();
        if (total == 0) {
            return "0.00%";
        }
        double rate = (double) (total - errorLogCount.get()) / total * 100;
        return String.format("%.2f%%", rate);
    }
    
    private String calculateUptime() {
        LocalDateTime now = LocalDateTime.now();
        long minutes = java.time.Duration.between(startTime, now).toMinutes();
        long hours = minutes / 60;
        long remainingMinutes = minutes % 60;
        
        return String.format("%d小时%d分钟", hours, remainingMinutes);
    }
    
    private String calculateHitRate() {
        long total = cacheHitCount.get() + cacheMissCount.get();
        if (total == 0) {
            return "0.00%";
        }
        double rate = (double) cacheHitCount.get() / total * 100;
        return String.format("%.2f%%", rate);
    }
    
    // ========== Counter methods (called from inside the framework) ==========
    
    public void incrementLogCount() {
        totalLogCount.incrementAndGet();
    }
    
    public void incrementErrorCount() {
        errorLogCount.incrementAndGet();
    }
    
    public void incrementCacheHit() {
        cacheHitCount.incrementAndGet();
    }
    
    public void incrementCacheMiss() {
        cacheMissCount.incrementAndGet();
    }
}
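
The counters and the success-rate formula above can also be exercised in isolation. The following is a minimal sketch, assuming a hypothetical `LogStats` helper that is not part of the framework. It uses the same formula as `calculateSuccessRate()`, swaps in `LongAdder` for the counters, and formats with `Locale.ROOT` so the decimal separator does not depend on the server's default locale.

```java
import java.util.Locale;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper, not part of the framework: keeps the counters and the
// rate formatting in one place so they can be tested without Actuator.
final class LogStats {

    // LongAdder is designed for counters that many threads increment concurrently.
    private final LongAdder total = new LongAdder();
    private final LongAdder errors = new LongAdder();

    // Records one log call and, if it failed, one error.
    void recordCall(boolean failed) {
        total.increment();
        if (failed) {
            errors.increment();
        }
    }

    long totalCount() {
        return total.sum();
    }

    long errorCount() {
        return errors.sum();
    }

    // Same formula as calculateSuccessRate() above. Locale.ROOT keeps the decimal
    // separator a dot regardless of the default locale.
    String successRate() {
        long t = total.sum();
        if (t == 0) {
            return "0.00%";
        }
        double rate = (double) (t - errors.sum()) / t * 100;
        return String.format(Locale.ROOT, "%.2f%%", rate);
    }
}
```

With 1250 recorded calls of which 3 failed, `successRate()` returns `99.76%`, the same figure the formula in `LogEndpoint` produces for those counts.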

LogMetrics - Metrics Collector

java
package com.simpleflow.log.springboot.actuator;

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.springframework.boot.autoconfigure.condition.ConditionalOnClass;
import org.springframework.stereotype.Component;

import java.util.concurrent.atomic.AtomicLong;

/**
 * Metrics collector for the logging framework.
 *
 * Integrates with Micrometer to publish counters, timers, and gauges.
 */
@Component
@ConditionalOnClass(MeterRegistry.class)
public class LogMetrics {
    
    private final MeterRegistry meterRegistry;
    
    // Counters
    private final Counter logMethodCallCounter;
    private final Counter logErrorCounter;
    private final Counter cacheHitCounter;
    private final Counter cacheMissCounter;
    
    // Timers
    private final Timer logExecutionTimer;
    private final Timer configResolveTimer;
    
    // Values backing the gauges
    private final AtomicLong activeLogContexts;
    private final AtomicLong cacheSize;
    
    public LogMetrics(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
        
        // Initialize the counters
        this.logMethodCallCounter = Counter.builder("simpleflow.log.method.calls")
                .description("Total number of log method calls")
                .register(meterRegistry);
        
        this.logErrorCounter = Counter.builder("simpleflow.log.errors")
                .description("Total number of log errors")
                .register(meterRegistry);
        
        this.cacheHitCounter = Counter.builder("simpleflow.log.cache.hits")
                .description("Number of cache hits")
                .register(meterRegistry);
        
        this.cacheMissCounter = Counter.builder("simpleflow.log.cache.misses")
                .description("Number of cache misses")
                .register(meterRegistry);
        
        // Initialize the timers
        this.logExecutionTimer = Timer.builder("simpleflow.log.execution.time")
                .description("Log execution time")
                .register(meterRegistry);
        
        this.configResolveTimer = Timer.builder("simpleflow.log.config.resolve.time")
                .description("Configuration resolve time")
                .register(meterRegistry);
        
        // Initialize the gauges. Gauge.builder takes the state object and the
        // value function; register() takes only the registry.
        this.activeLogContexts = new AtomicLong(0);
        this.cacheSize = new AtomicLong(0);
        
        Gauge.builder("simpleflow.log.contexts.active", activeLogContexts, AtomicLong::get)
             .description("Number of active log contexts")
             .register(meterRegistry);
        
        Gauge.builder("simpleflow.log.cache.size", cacheSize, AtomicLong::get)
             .description("Current cache size")
             .register(meterRegistry);
    }
    
    // ========== Recording methods ==========
    
    public void recordMethodCall() {
        logMethodCallCounter.increment();
    }
    
    public void recordError() {
        logErrorCounter.increment();
    }
    
    public void recordCacheHit() {
        cacheHitCounter.increment();
    }
    
    public void recordCacheMiss() {
        cacheMissCounter.increment();
    }
    
    public Timer.Sample startExecutionTimer() {
        return Timer.start(meterRegistry);
    }
    
    public void stopExecutionTimer(Timer.Sample sample) {
        sample.stop(logExecutionTimer);
    }
    
    public Timer.Sample startConfigResolveTimer() {
        return Timer.start(meterRegistry);
    }
    
    public void stopConfigResolveTimer(Timer.Sample sample) {
        sample.stop(configResolveTimer);
    }
    
    public void setActiveContexts(long count) {
        activeLogContexts.set(count);
    }
    
    public void setCacheSize(long size) {
        cacheSize.set(size);
    }
    
    public void incrementActiveContexts() {
        activeLogContexts.incrementAndGet();
    }
    
    public void decrementActiveContexts() {
        activeLogContexts.decrementAndGet();
    }
}
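
The `incrementActiveContexts()` and `decrementActiveContexts()` pair only yields a meaningful gauge if every increment is matched by a decrement, including when the logged method throws. The following is a minimal sketch of that pairing, assuming a hypothetical `ActiveContextTracker` wrapper that is not part of the framework. A plain `AtomicLong` stands in for the value backing the `simpleflow.log.contexts.active` gauge so the example runs without Micrometer.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical wrapper, not part of the framework. The AtomicLong stands in for
// the value backing the "simpleflow.log.contexts.active" gauge.
final class ActiveContextTracker {

    private final AtomicLong active = new AtomicLong();

    // Runs the action while counting it as an active context. The finally block
    // guarantees the decrement even when the action throws.
    <T> T track(Supplier<T> action) {
        active.incrementAndGet();
        try {
            return action.get();
        } finally {
            active.decrementAndGet();
        }
    }

    long activeCount() {
        return active.get();
    }
}
```

In the real collector the same try/finally shape would wrap the calls to `LogMetrics`, so a failing business method cannot leave the gauge permanently inflated.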

Usage and Testing

1. Enable via configuration

yaml
# application.yml
management:
  endpoints:
    web:
      exposure:
        include: health,info,simpleflow-log,metrics
  endpoint:
    health:
      show-details: always
  health:
    log:
      enabled: true

simpleflow:
  log:
    actuator-enabled: true

2. Access the endpoints

bash
# Health check
curl http://localhost:8080/actuator/health/log

# View the logging framework info
curl http://localhost:8080/actuator/simpleflow-log

# View the configuration
curl http://localhost:8080/actuator/simpleflow-log/config

# View the statistics
curl http://localhost:8080/actuator/simpleflow-log/stats

# Clear the cache
curl -X POST http://localhost:8080/actuator/simpleflow-log/clearCache

# View a Micrometer metric through the metrics endpoint
curl http://localhost:8080/actuator/metrics/simpleflow.log.method.calls

3. Sample output

Health check output:

json
{
  "status": "UP",
  "details": {
    "core": "UP",
    "memory": "已使用: 256MB, 最大: 1024MB, 使用率: 25.00%",
    "threads": "当前线程: 45, 守护线程: 12, 总启动线程: 67",
    "cache": "MethodCache: 150, ClassCache: 25",
    "lastCheckTime": 1692766815123,
    "uptime": "2小时15分钟30秒"
  }
}

Framework info output:

json
{
  "framework": "SimpleFlow Log Framework",
  "version": "1.0.0",
  "startTime": "2024-08-23T08:30:15",
  "status": "RUNNING",
  "configuration": {
    "enabled": true,
    "defaultLevel": "INFO",
    "webEnabled": true,
    "actuatorEnabled": true
  },
  "statistics": {
    "totalLogCount": 1250,
    "errorLogCount": 3,
    "successRate": "99.76%",
    "startTime": "2024-08-23T08:30:15",
    "uptime": "2小时15分钟"
  },
  "performance": {
    "totalMemory": 268435456,
    "freeMemory": 201326592,
    "maxMemory": 268435456,
    "usedMemory": 67108864,
    "memoryUsagePercentage": "25.00%"
  },
  "cache": {
    "stats": "MethodCache: 150, ClassCache: 25",
    "hitCount": 980,
    "missCount": 195,
    "hitRate": "83.40%"
  }
}

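Chapter Summary

✅ Completed tasks

  1. Health checks: implemented LogHealthIndicator to monitor the framework's status
  2. Custom endpoint: created LogEndpoint to provide management functions
  3. Metrics collection: integrated Micrometer to collect performance metrics
  4. Runtime management: added cache clearing and statistics reset operations
  5. Monitoring integration: assembled a complete Actuator integration

🎯 Key takeaways

  • The right way to extend Actuator
  • Design principles for health indicators
  • Techniques for implementing custom endpoints
  • Integrating metrics collection with monitoring systems
  • Security considerations for runtime management functions

💡 Questions to consider

  1. How would you design finer-grained health checks?
  2. How should alerting thresholds for the metrics be determined?
  3. How can the management endpoints be secured?

🚀 Next chapter

In the final chapter we will build a complete sample application that brings all the modules together, and test it thoroughly to show how the framework behaves in a real project.


💡 Design principle: a good monitoring system surfaces problems proactively, provides detailed information, and supports a fast response. With the Actuator integration, the framework now has production-grade observability.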
