An Analysis of Spring Boot Actuator's Monitoring Mechanism
Contents
- 🎯 1. Why Actuator: From Monitoring to Observability
  - 💡 Limitations of Traditional Monitoring
  - 🔄 From Monitoring to Observability
- 🏗️ 2. Core Structure and Auto-Configuration
  - 🏛️ Actuator Architecture Layers
  - 🔧 Auto-Configuration Source Walkthrough
  - 📊 Endpoint Discovery
- 💓 3. Health Checks (HealthEndpoint)
  - 🏥 Health Check Architecture
  - 🔧 Built-in Health Indicators
  - 💻 Custom Health Checks in Practice
  - 📋 Tuning Health Check Configuration
- 📊 4. Metrics: Core Metrics and the Micrometer Framework
  - 🌡️ Micrometer Architecture Overview
  - 🔧 Collecting Core Metrics in Practice
  - 📈 JVM and System Metrics
  - 🔄 Integrating Micrometer with Prometheus
- ℹ️ 5. InfoEndpoint: Exposing Application Metadata
  - 📝 How Info Is Collected
  - 🔧 Info Configuration in Practice
- 🔧 6. Custom Endpoints
  - 🛠️ Endpoint Types and Use Cases
  - 💻 Custom Endpoints in Practice
- 🔒 7. Security and Exposure Configuration
  - 🛡️ Endpoint Security Strategy
  - 📊 Endpoint Access Control Matrix
  - 🔍 Endpoint Monitoring and Auditing
- 💎 8. Summary
  - 🎯 Core Value of the Actuator Monitoring Stack
  - 📈 Monitoring Evolution Roadmap
  - 🔧 Enterprise Best Practices
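🎯 1. Why Actuator: From Monitoring to Observability
💡 Limitations of Traditional Monitoring
The problem with the traditional approach:
```java
import java.util.HashMap;
import java.util.Map;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// ❌ Traditional approach: a hand-rolled monitoring endpoint
@RestController
public class ManualMonitoringController {

    @GetMapping("/monitor/health")
    public Map<String, Object> healthCheck() {
        // Every application has to re-implement this
        Map<String, Object> status = new HashMap<>();
        status.put("status", "UP");
        status.put("timestamp", System.currentTimeMillis());
        // Each component still has to be checked by hand...
        return status;
    }
}
```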
Actuator's unified solution:
```yaml
# ✅ The Actuator way: standardized monitoring endpoints
management:
  endpoints:
    web:
      exposure:
        include: health,metrics,info
  endpoint:
    health:
      show-details: always
      show-components: always
```
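🔄 From Monitoring to Observability
The three pillars of observability:
- Metrics: time-series data, aggregate analysis
- Logs: event records, contextual information
- Traces: request paths, performance analysis
Where Actuator fits into observability:
- Metrics collection: exposes application metrics through the Metrics endpoint
- Health status: reports system health through the Health endpoint
- Metadata management: shows application version and configuration through the Info endpoint
- Custom monitoring: extends monitoring through custom endpoints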
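🏗️ 2. Core Structure and Auto-Configuration
🏛️ Actuator Architecture Layers
Core component relationships (a simplified conceptual model; in Spring Boot itself `@Endpoint` is an annotation, and the interface that describes a discovered endpoint is `ExposableEndpoint`):
- Endpoint (interface): id() : String, enableByDefault() : boolean, isEnabledByDefault() : boolean
- WebEndpoint (interface): getOperations() : Collection<Operation>
- Operation (interface): invoke() : Object
- EndpointAutoConfiguration: healthEndpoint() : HealthEndpoint, metricsEndpoint() : MetricsEndpoint, infoEndpoint() : InfoEndpoint
- Endpoints: HealthEndpoint, MetricsEndpoint, InfoEndpoint
- Supporting types: HealthIndicator, MeterRegistry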
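🔧 Auto-Configuration Source Walkthrough
A condensed sketch of how the endpoint beans are auto-configured. It is simplified for illustration and is not verbatim Spring Boot source: in the framework, the health, metrics, and info endpoint beans are contributed by HealthEndpointAutoConfiguration, MetricsEndpointAutoConfiguration, and InfoEndpointAutoConfiguration respectively.
```java
// Simplified for illustration; not the actual Spring Boot source
@Configuration(proxyBeanMethods = false)
@AutoConfigureAfter({ HealthEndpointAutoConfiguration.class,
        MetricsEndpointAutoConfiguration.class })
public class EndpointAutoConfiguration {

    @Bean
    @ConditionalOnMissingBean
    public HealthEndpoint healthEndpoint(HealthContributorRegistry registry) {
        return new HealthEndpoint(registry);
    }

    @Bean
    @ConditionalOnMissingBean
    public MetricsEndpoint metricsEndpoint(MeterRegistry registry) {
        return new MetricsEndpoint(registry);
    }

    @Bean
    @ConditionalOnMissingBean
    public InfoEndpoint infoEndpoint(InfoContributorRegistry registry) {
        return new InfoEndpoint(registry);
    }
}
```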
Web endpoint auto-configuration (again condensed; the real constructors take more collaborators than shown here):
```java
// Simplified for illustration; not the actual Spring Boot source
@Configuration(proxyBeanMethods = false)
@ConditionalOnWebApplication
@ConditionalOnClass(WebEndpoint.class)
@AutoConfigureAfter(EndpointAutoConfiguration.class)
public class WebEndpointAutoConfiguration {

    @Bean
    @ConditionalOnMissingBean
    public WebEndpointDiscoverer webEndpointDiscoverer(
            ApplicationContext applicationContext) {
        return new WebEndpointDiscoverer(applicationContext,
                OperationParameterMapper.instance(), EndpointMediaTypes.DEFAULT);
    }

    @Bean
    @ConditionalOnMissingBean
    public ControllerEndpointHandlerMapping controllerEndpointHandlerMapping(
            WebEndpointDiscoverer discoverer) {
        return new ControllerEndpointHandlerMapping(discoverer);
    }
}
```
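📊 Endpoint Discovery
Endpoint scanning and registration can be inspected at startup with a small debugging bean:
```java
import java.util.Collection;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.actuate.endpoint.annotation.Endpoint;
import org.springframework.boot.actuate.endpoint.web.ExposableWebEndpoint;
import org.springframework.boot.actuate.endpoint.web.annotation.WebEndpointDiscoverer;
import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.ApplicationContext;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;

@Component
public class EndpointDiscoveryDebugger {

    private static final Logger log = LoggerFactory.getLogger(EndpointDiscoveryDebugger.class);

    @Autowired
    private ApplicationContext applicationContext;

    @EventListener
    public void onApplicationReady(ApplicationReadyEvent event) {
        log.info("=== Actuator endpoint discovery report ===");
        // 1. Find all beans annotated with @Endpoint (an annotation, not an interface)
        String[] endpointBeans = applicationContext.getBeanNamesForAnnotation(Endpoint.class);
        log.info("Endpoints found: {}", endpointBeans.length);
        for (String beanName : endpointBeans) {
            Endpoint endpoint = applicationContext.findAnnotationOnBean(beanName, Endpoint.class);
            if (endpoint != null) {
                log.info("Endpoint: {} (ID: {})", beanName, endpoint.id());
            }
        }
        // 2. Check which endpoints are exposed over the web (only if a discoverer bean exists)
        applicationContext.getBeanProvider(WebEndpointDiscoverer.class).ifAvailable(discoverer -> {
            Collection<ExposableWebEndpoint> webEndpoints = discoverer.getEndpoints();
            log.info("Web endpoints: {}", webEndpoints.size());
        });
    }
}
```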
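💓 3. Health Checks (HealthEndpoint)
🏥 Health Check Architecture
HealthIndicator hierarchy:
- HealthIndicator (interface)
  - AbstractHealthIndicator
    - DataSourceHealthIndicator
    - DiskSpaceHealthIndicator
    - RedisHealthIndicator
  - Custom HealthIndicator implementations
  - CompositeHealthIndicator
- HealthContributorRegistry (holds the registered health contributors)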
🔧 Built-in Health Indicators
DataSourceHealthIndicator, condensed for illustration (not verbatim source; the real implementation also runs a validation query against the connection):
```java
public class DataSourceHealthIndicator extends AbstractHealthIndicator {

    private final DataSource dataSource;

    // Constructor and getValidationQuery() omitted for brevity

    @Override
    protected void doHealthCheck(Health.Builder builder) throws Exception {
        // 1. No data source available
        if (this.dataSource == null) {
            builder.down().withDetail("database", "Unknown");
            return;
        }
        // 2. Obtain a connection; failing to connect marks the indicator DOWN
        try (Connection connection = this.dataSource.getConnection()) {
            // 3. Read database metadata and report it as details
            DatabaseMetaData metaData = connection.getMetaData();
            builder.up()
                    .withDetail("database", metaData.getDatabaseProductName())
                    .withDetail("version", metaData.getDatabaseProductVersion())
                    .withDetail("validationQuery", getValidationQuery());
        } catch (Exception ex) {
            builder.down(ex);
        }
    }
}
```
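💻 Custom Health Checks in Practice
An enterprise-style health indicator. The builder starts as UP so that each check can downgrade it; calling `up()` after the checks would silently overwrite any `down()` they had set.
```java
import java.util.ArrayList;
import java.util.List;

import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

@Component
@Slf4j
public class ComprehensiveHealthIndicator implements HealthIndicator {

    @Autowired
    private RestTemplate restTemplate;

    @Autowired
    private RedisTemplate<String, String> redisTemplate;

    @Value("${app.external.service.url}")
    private String externalServiceUrl;

    @Override
    public Health health() {
        // Start from UP; individual checks downgrade to DOWN when they fail
        Health.Builder builder = Health.up();
        try {
            // 1. Internal component state
            checkInternalComponents(builder);
            // 2. External dependencies
            checkExternalDependencies(builder);
            // 3. Business-level health
            checkBusinessHealth(builder);
            return builder.build();
        } catch (Exception e) {
            log.error("Health check failed", e);
            return builder.down(e).build();
        }
    }

    private void checkInternalComponents(Health.Builder builder) {
        // Memory usage
        Runtime runtime = Runtime.getRuntime();
        long maxMemory = runtime.maxMemory();
        long usedMemory = runtime.totalMemory() - runtime.freeMemory();
        double memoryUsage = (double) usedMemory / maxMemory * 100;
        builder.withDetail("memory.used", formatMemory(usedMemory))
                .withDetail("memory.max", formatMemory(maxMemory))
                .withDetail("memory.usage", String.format("%.1f%%", memoryUsage));
        // Warn (without changing the status) when memory usage is high
        if (memoryUsage > 90) {
            builder.withDetail("memory.warning", "Memory usage is too high");
        }
    }

    private void checkExternalDependencies(Health.Builder builder) {
        // Redis connectivity
        try {
            redisTemplate.opsForValue().get("health-check");
            builder.withDetail("redis", "CONNECTED");
        } catch (Exception e) {
            builder.withDetail("redis", "DISCONNECTED: " + e.getMessage());
            builder.down();
        }
        // External service
        try {
            ResponseEntity<String> response = restTemplate.getForEntity(
                    externalServiceUrl + "/health", String.class);
            builder.withDetail("external.service",
                    response.getStatusCode().toString());
        } catch (Exception e) {
            builder.withDetail("external.service", "UNAVAILABLE: " + e.getMessage());
            builder.down();
        }
    }

    private void checkBusinessHealth(Health.Builder builder) {
        // Business metrics
        long pendingOrders = getPendingOrderCount();
        long failedPayments = getFailedPaymentCount();
        builder.withDetail("orders.pending", pendingOrders)
                .withDetail("payments.failed", failedPayments);
        // Business rules; warnings are collected in a list so one does not overwrite another
        List<String> warnings = new ArrayList<>();
        if (pendingOrders > 1000) {
            warnings.add("Too many pending orders");
        }
        if (failedPayments > 100) {
            warnings.add("Too many failed payments");
        }
        if (!warnings.isEmpty()) {
            builder.withDetail("business.warnings", warnings);
        }
    }

    // Placeholders: replace with real application-specific queries
    private long getPendingOrderCount() {
        return 0;
    }

    private long getFailedPaymentCount() {
        return 0;
    }

    private String formatMemory(long bytes) {
        return String.format("%.1f MB", bytes / (1024.0 * 1024.0));
    }
}
```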
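📋 Tuning Health Check Configuration
Layered health check configuration. Group members are health contributor names: `db` is the name of the built-in DataSource indicator, while `externalService` and `business` are assumed here to be custom indicators defined by the application.
```yaml
management:
  endpoint:
    health:
      # Show details (use with care in production)
      show-details: when_authorized
      show-components: always
      # Health groups
      group:
        # Core components (fast response)
        core:
          include: diskSpace,db,redis
        # Full check (adds external dependencies); a group lists contributors,
        # not other groups, so the core members are repeated
        full:
          include: diskSpace,db,redis,externalService,business
      # Custom status mapping
      status:
        order: DOWN,OUT_OF_SERVICE,UP,UNKNOWN
        http-mapping:
          DOWN: 503
          OUT_OF_SERVICE: 503
          UP: 200
```
Using the health groups:
```bash
# Quick check (core components only)
curl http://localhost:8080/actuator/health/core
# Full check (all dependencies)
curl http://localhost:8080/actuator/health/full
```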
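To make the effect of the status order and HTTP mapping concrete, here is a minimal plain-Java sketch of the idea: a group's status is the most severe status among its members, and that status is then mapped to an HTTP code. This is an illustrative model only, not Actuator's implementation; the class name `HealthAggregationSketch`, its methods, and the rule that unlisted statuses sort last are assumptions made for the example.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class HealthAggregationSketch {

    // Severity order, most severe first (mirrors the status order setting)
    static final List<String> ORDER = List.of("DOWN", "OUT_OF_SERVICE", "UP", "UNKNOWN");

    // HTTP codes per status (mirrors the http-mapping setting)
    static final Map<String, Integer> HTTP = Map.of("DOWN", 503, "OUT_OF_SERVICE", 503, "UP", 200);

    /** Returns the most severe status among the components that belong to the group. */
    static String aggregate(Map<String, String> components, Set<String> groupMembers) {
        return components.entrySet().stream()
                .filter(e -> groupMembers.contains(e.getKey()))
                .map(Map.Entry::getValue)
                .min(Comparator.comparingInt(HealthAggregationSketch::rank))
                .orElse("UNKNOWN");
    }

    /** Position in the severity order; statuses not in the list sort last (an assumption). */
    static int rank(String status) {
        int index = ORDER.indexOf(status);
        return index >= 0 ? index : ORDER.size();
    }

    /** Statuses without an explicit mapping fall back to 200. */
    static int httpCode(String status) {
        return HTTP.getOrDefault(status, 200);
    }

    public static void main(String[] args) {
        Map<String, String> components = Map.of(
                "diskSpace", "UP", "db", "UP", "redis", "UP", "externalService", "DOWN");
        Set<String> core = Set.of("diskSpace", "db", "redis");
        Set<String> full = Set.of("diskSpace", "db", "redis", "externalService");

        String coreStatus = aggregate(components, core);
        String fullStatus = aggregate(components, full);
        System.out.println("core: " + coreStatus + " " + httpCode(coreStatus)); // core: UP 200
        System.out.println("full: " + fullStatus + " " + httpCode(fullStatus)); // full: DOWN 503
    }
}
```

The same failing dependency therefore leaves the core group healthy while the full group returns 503, which is why a load balancer probe is usually pointed at the narrower group.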
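📊 4. Metrics: Core Metrics and the Micrometer Framework
🌡️ Micrometer Architecture Overview
Micrometer meter types (each is a kind of Meter):
- Counter: monotonically increasing count
- Gauge: instantaneous value
- Timer: distribution of durations
- DistributionSummary: distribution of values
- LongTaskTimer: timing of long-running tasks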
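The difference between a counter and a gauge is easiest to see side by side. The following plain-Java sketch models only the semantics: a counter owns a running total that callers add to, while a gauge owns nothing and samples some other object whenever it is read. `SketchCounter` and `SketchGauge` are hypothetical stand-ins written for this example, not Micrometer classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.DoubleAdder;
import java.util.function.DoubleSupplier;

public class MeterSemanticsSketch {

    /** Counter: the meter itself holds a total that only grows as callers add to it. */
    static final class SketchCounter {
        private final DoubleAdder total = new DoubleAdder();

        void increment(double amount) {
            total.add(amount);
        }

        double count() {
            return total.sum();
        }
    }

    /** Gauge: holds no state of its own; each read samples the supplier. */
    static final class SketchGauge {
        private final DoubleSupplier source;

        SketchGauge(DoubleSupplier source) {
            this.source = source;
        }

        double value() {
            return source.getAsDouble();
        }
    }

    public static void main(String[] args) {
        SketchCounter ordersCreated = new SketchCounter();
        List<String> queue = new ArrayList<>();
        SketchGauge queueSize = new SketchGauge(() -> queue.size());

        queue.add("order-1");
        ordersCreated.increment(1);
        queue.add("order-2");
        ordersCreated.increment(1);
        queue.remove(0); // an order leaves the queue: the gauge drops, the counter does not

        System.out.println(ordersCreated.count()); // 2.0
        System.out.println(queueSize.value());     // 1.0
    }
}
```

As a rule of thumb, anything that can go down (queue depth, inventory, memory in use) is a gauge, and anything that only accumulates (orders created, errors) is a counter.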
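🔧 Collecting Core Metrics in Practice
A custom business metrics example (`Order` is the application's own order type and is not shown):
```java
import java.util.concurrent.CompletableFuture;

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.DistributionSummary;
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import lombok.extern.slf4j.Slf4j;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;

@Service
@Slf4j
public class OrderMetricsService {

    private final MeterRegistry meterRegistry;
    // 1. Counter: number of orders created
    private final Counter orderCreatedCounter;
    // 2. Timer: order processing time
    private final Timer orderProcessingTimer;
    // 3. Distribution summary: order amounts
    private final DistributionSummary orderAmountSummary;
    // 4. Gauge: inventory level
    private final Gauge inventoryGauge;

    public OrderMetricsService(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
        // Register the meters
        this.orderCreatedCounter = Counter.builder("order.created")
                .description("Number of orders created")
                .tag("application", "order-service")
                .register(meterRegistry);
        this.orderProcessingTimer = Timer.builder("order.processing.time")
                .description("Order processing time")
                .tag("application", "order-service")
                .register(meterRegistry);
        this.orderAmountSummary = DistributionSummary.builder("order.amount")
                .description("Order amount distribution")
                .baseUnit("CNY")
                .register(meterRegistry);
        // A gauge takes the observed object and its value function when it is built
        this.inventoryGauge = Gauge.builder("inventory.count", this, OrderMetricsService::getInventoryCount)
                .description("Inventory count")
                .tag("product", "all")
                .register(meterRegistry);
    }

    @Async
    public CompletableFuture<Order> processOrder(Order order) {
        // Time the processing with the timer
        return orderProcessingTimer.record(() -> {
            try {
                log.info("Processing order: {}", order.getId());
                // Count the order
                orderCreatedCounter.increment();
                // Record the order amount
                orderAmountSummary.record(order.getAmount().doubleValue());
                // Simulate business processing
                Thread.sleep(100);
                log.info("Order processed: {}", order.getId());
                return CompletableFuture.completedFuture(order);
            } catch (Exception e) {
                // Record an error metric tagged with the exception type
                meterRegistry.counter("order.error",
                        "reason", e.getClass().getSimpleName()).increment();
                throw new RuntimeException("Order processing failed", e);
            }
        });
    }

    // Value function sampled by the gauge
    public double getInventoryCount() {
        return getCurrentInventoryCount();
    }

    // Placeholder: replace with a real inventory lookup
    private double getCurrentInventoryCount() {
        return 0;
    }
}
```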
📈 JVM and System Metrics
Built-in metrics auto-configuration, condensed for illustration (in Spring Boot itself each binder bean is also guarded by @ConditionalOnMissingBean, and ProcessorMetrics is registered by SystemMetricsAutoConfiguration):
```java
// Simplified for illustration; not the actual Spring Boot source
@Configuration
@ConditionalOnClass(MeterRegistry.class)
public class JvmMetricsAutoConfiguration {

    @Bean
    public JvmGcMetrics jvmGcMetrics() {
        return new JvmGcMetrics();
    }

    @Bean
    public JvmMemoryMetrics jvmMemoryMetrics() {
        return new JvmMemoryMetrics();
    }

    @Bean
    public JvmThreadMetrics jvmThreadMetrics() {
        return new JvmThreadMetrics();
    }

    @Bean
    public ProcessorMetrics processorMetrics() {
        return new ProcessorMetrics();
    }
}
```
Key JVM metrics:
```bash
# JVM memory usage
curl http://localhost:8080/actuator/metrics/jvm.memory.used
# GC pauses
curl http://localhost:8080/actuator/metrics/jvm.gc.pause
# Live threads
curl http://localhost:8080/actuator/metrics/jvm.threads.live
```
🔄 Integrating Micrometer with Prometheus
Prometheus configuration example. It requires the micrometer-registry-prometheus dependency on the classpath and uses the Spring Boot 2.x property names; Spring Boot 3 moved the exporter settings to `management.prometheus.metrics.export.*`.
```yaml
management:
  endpoints:
    web:
      exposure:
        include: health,metrics,prometheus
  metrics:
    export:
      prometheus:
        enabled: true
        step: 1m
    tags:
      application: ${spring.application.name}
      environment: ${spring.profiles.active}
```
Metrics in Prometheus exposition format:
```text
# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{application="order-service",environment="prod",area="heap",id="Eden Space",} 1234567.0
jvm_memory_used_bytes{application="order-service",environment="prod",area="heap",id="Survivor Space",} 234567.0
jvm_memory_used_bytes{application="order-service",environment="prod",area="heap",id="Tenured Gen",} 3456789.0
# HELP order_created_total Number of orders created
# TYPE order_created_total counter
order_created_total{application="order-service",environment="prod",} 1500.0
```
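The sample output shows how dotted meter names turn into Prometheus names: `jvm.memory.used` becomes `jvm_memory_used_bytes` and `order.created` becomes `order_created_total`. The sketch below reproduces just those two rules in plain Java. It is a deliberately simplified model and not Micrometer's actual naming convention, which handles more meter types and character sanitization; `PrometheusNamingSketch`, `MeterType`, and `toPrometheusName` are names invented for this example.

```java
public class PrometheusNamingSketch {

    enum MeterType { COUNTER, GAUGE }

    /**
     * Converts a dotted meter name to a Prometheus-style name:
     * dots become underscores, the base unit is appended when given,
     * and counters receive a "_total" suffix.
     */
    static String toPrometheusName(String meterName, MeterType type, String baseUnit) {
        String name = meterName.replace('.', '_');
        if (baseUnit != null && !baseUnit.isEmpty() && !name.endsWith("_" + baseUnit)) {
            name = name + "_" + baseUnit;
        }
        if (type == MeterType.COUNTER && !name.endsWith("_total")) {
            name = name + "_total";
        }
        return name;
    }

    public static void main(String[] args) {
        // prints jvm_memory_used_bytes
        System.out.println(toPrometheusName("jvm.memory.used", MeterType.GAUGE, "bytes"));
        // prints order_created_total
        System.out.println(toPrometheusName("order.created", MeterType.COUNTER, null));
    }
}
```

Knowing the mapping helps when writing PromQL: the name to query is the converted one, not the name passed to the meter builder.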
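ℹ️ 5. InfoEndpoint: Exposing Application Metadata
📝 How Info Is Collected
InfoContributor architecture:
- InfoEndpoint collects the output of every InfoContributor into a single Info object
- InfoContributor implementations:
  - EnvironmentInfoContributor
  - GitInfoContributor
  - BuildInfoContributor
  - Custom InfoContributor implementations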
🔧 Info Configuration in Practice
Build info configuration:
```xml
<!-- Generate build info with Maven -->
<plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
    <executions>
        <execution>
            <goals>
                <goal>build-info</goal>
            </goals>
        </execution>
    </executions>
</plugin>
```
Git info configuration:
```yaml
# application.yml
management:
  info:
    git:
      mode: full
    env:
      enabled: true
    build:
      enabled: true
```
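A custom info contributor:
```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.Map;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.actuate.info.Info;
import org.springframework.boot.actuate.info.InfoContributor;
import org.springframework.core.env.Environment;
import org.springframework.stereotype.Component;

@Component
public class ComprehensiveInfoContributor implements InfoContributor {

    // Captured once when the bean is created, as an approximation of startup time;
    // calling LocalDateTime.now() inside contribute() would report the request time instead
    private final LocalDateTime startupTime = LocalDateTime.now();

    @Autowired
    private Environment environment;

    @Value("${app.version:unknown}")
    private String appVersion;

    @Override
    public void contribute(Info.Builder builder) {
        // 1. Basic application info
        builder.withDetail("application", Map.of(
                "name", environment.getProperty("spring.application.name", "unknown"),
                "version", appVersion,
                "profiles", Arrays.toString(environment.getActiveProfiles()),
                "startupTime", startupTime.format(DateTimeFormatter.ISO_LOCAL_DATE_TIME)
        ));
        // 2. System info (Map.of rejects null values, so every property gets a default)
        builder.withDetail("system", Map.of(
                "java.version", System.getProperty("java.version", "unknown"),
                "os.name", System.getProperty("os.name", "unknown"),
                "os.version", System.getProperty("os.version", "unknown"),
                "user.timezone", System.getProperty("user.timezone", "unknown")
        ));
        // 3. Runtime info
        Runtime runtime = Runtime.getRuntime();
        builder.withDetail("runtime", Map.of(
                "availableProcessors", runtime.availableProcessors(),
                "maxMemory", runtime.maxMemory(),
                "totalMemory", runtime.totalMemory(),
                "freeMemory", runtime.freeMemory()
        ));
        // 4. Business info
        builder.withDetail("business", Map.of(
                "team", "Order Platform Team",
                "owner", "Engineer Zhang",
                "slackChannel", "#order-service",
                "documentation", "https://wiki.company.com/order-service"
        ));
    }
}
```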
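🔧 6. Custom Endpoints
🛠️ Endpoint Types and Use Cases
Endpoint operation types compared:
| Operation type | Annotation | Purpose | Typical examples |
|---|---|---|---|
| 🟢 Read | @ReadOperation | Retrieve application state or statistics | Cache status, system metrics, configuration values |
| 🟡 Write | @WriteOperation | Perform updates or refreshes | Refresh configuration, change log levels at runtime |
| 🔴 Delete | @DeleteOperation | Delete resources or clean up data | Clear caches, delete temporary files, purge logs |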
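Over HTTP, Actuator exposes read operations as GET, write operations as POST, and delete operations as DELETE on the endpoint's path. The plain-Java sketch below imitates that annotation-driven mapping with reflection so the idea can be run without Spring. The annotations `ReadOp`, `WriteOp`, `DeleteOp`, the `CacheEndpoint` class, and the `discover` method are hypothetical stand-ins for this illustration, not Actuator types.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

public class OperationMappingSketch {

    // Hypothetical stand-ins for Actuator's operation annotations
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD) @interface ReadOp {}
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD) @interface WriteOp {}
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD) @interface DeleteOp {}

    /** Hypothetical endpoint with one operation of each kind. */
    static class CacheEndpoint {
        @ReadOp public String status() { return "ok"; }
        @WriteOp public void refresh() {}
        @DeleteOp public void clear() {}
    }

    /** Maps each annotated method name to "HTTP_METHOD /actuator/{id}". */
    static Map<String, String> discover(String id, Class<?> type) {
        Map<String, String> routes = new TreeMap<>();
        for (Method method : type.getDeclaredMethods()) {
            String verb = method.isAnnotationPresent(ReadOp.class) ? "GET"
                    : method.isAnnotationPresent(WriteOp.class) ? "POST"
                    : method.isAnnotationPresent(DeleteOp.class) ? "DELETE"
                    : null;
            if (verb != null) {
                routes.put(method.getName(), verb + " /actuator/" + id);
            }
        }
        return routes;
    }

    public static void main(String[] args) {
        // Prints, in alphabetical order of method name:
        // clear -> DELETE /actuator/cache
        // refresh -> POST /actuator/cache
        // status -> GET /actuator/cache
        discover("cache", CacheEndpoint.class)
                .forEach((name, route) -> System.out.println(name + " -> " + route));
    }
}
```

Because the HTTP verb follows from the operation type, write and delete operations are the ones that deserve the strictest access rules.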
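💻 Custom Endpoints in Practice
A business monitoring endpoint (`OrderService` is the application's own service and is not shown):
```java
import java.sql.Timestamp;

import io.micrometer.core.instrument.MeterRegistry;
import lombok.Data;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.actuate.endpoint.annotation.DeleteOperation;
import org.springframework.boot.actuate.endpoint.annotation.Endpoint;
import org.springframework.boot.actuate.endpoint.annotation.ReadOperation;
import org.springframework.boot.actuate.endpoint.annotation.Selector;
import org.springframework.boot.actuate.endpoint.annotation.WriteOperation;
import org.springframework.lang.Nullable;
import org.springframework.stereotype.Component;

@Endpoint(id = "orderstats")
@Component
@Slf4j
public class OrderStatsEndpoint {

    @Autowired
    private OrderService orderService;

    @Autowired
    private MeterRegistry meterRegistry;

    /**
     * Read operation: return order statistics.
     */
    @ReadOperation
    public OrderStats getOrderStats() {
        log.info("Querying order statistics");
        OrderStats stats = new OrderStats();
        stats.setTotalOrders(orderService.getTotalOrderCount());
        stats.setTodayOrders(orderService.getTodayOrderCount());
        stats.setPendingOrders(orderService.getPendingOrderCount());
        stats.setAverageAmount(orderService.getAverageOrderAmount());
        // Live figures derived from the metrics system
        stats.setOrderRate(getCurrentOrderRate());
        stats.setErrorRate(getCurrentErrorRate());
        return stats;
    }

    /**
     * Write operation: reset statistics counters.
     */
    @WriteOperation
    public String resetCounters(@Selector String counterType) {
        log.info("Resetting counters: {}", counterType);
        switch (counterType) {
            case "all":
                // Removes every meter in the registry, including JVM and system meters
                meterRegistry.clear();
                return "All counters reset";
            case "order":
                // Remove the order meters. Meter.close() alone does not unregister a meter,
                // and code that holds a removed meter must register a new one afterwards
                meterRegistry.find("order.created").meters().forEach(meterRegistry::remove);
                return "Order counters reset";
            default:
                throw new IllegalArgumentException("Unsupported counter type: " + counterType);
        }
    }

    /**
     * Delete operation: clean up historical data.
     */
    @DeleteOperation
    public String cleanupOldData(@Nullable Integer days) {
        int retentionDays = days != null ? days : 30;
        log.info("Cleaning up data older than {} days", retentionDays);
        int deletedCount = orderService.cleanupOldOrders(retentionDays);
        return String.format("Cleaned up %d historical records", deletedCount);
    }

    // Placeholders: derive these from the application's own meters
    private double getCurrentOrderRate() {
        return 0;
    }

    private double getCurrentErrorRate() {
        return 0;
    }

    @Data
    public static class OrderStats {
        private long totalOrders;
        private long todayOrders;
        private long pendingOrders;
        private double averageAmount;
        private double orderRate;
        private double errorRate;
        private Timestamp timestamp = new Timestamp(System.currentTimeMillis());
    }
}
```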
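A web-only endpoint example. Mappings inside a @ControllerEndpoint are relative to the endpoint's own path, so `/requests` is served at `/actuator/webmonitor/requests`. Note that @ControllerEndpoint is deprecated in newer Spring Boot 3.x releases.
```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import lombok.Data;
import lombok.extern.slf4j.Slf4j;
import org.springframework.boot.actuate.endpoint.web.annotation.ControllerEndpoint;
import org.springframework.stereotype.Component;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.ResponseBody;

@ControllerEndpoint(id = "webmonitor")
@Component
@Slf4j
public class WebMonitorEndpoint {

    private final Map<String, RequestStats> requestStats = new ConcurrentHashMap<>();

    /**
     * HTTP endpoint: live request monitoring.
     */
    @GetMapping("/requests")
    @ResponseBody
    public Map<String, Object> getRequestStats() {
        Map<String, Object> stats = new HashMap<>();
        stats.put("timestamp", System.currentTimeMillis());
        stats.put("totalEndpoints", requestStats.size());
        stats.put("endpoints", requestStats);
        // Overall statistics
        long totalRequests = requestStats.values().stream()
                .mapToLong(RequestStats::getRequestCount)
                .sum();
        double avgResponseTime = requestStats.values().stream()
                .mapToDouble(RequestStats::getAverageResponseTime)
                .average()
                .orElse(0.0);
        stats.put("totalRequests", totalRequests);
        stats.put("averageResponseTime", avgResponseTime);
        return stats;
    }

    /**
     * Record statistics for one request.
     */
    public void recordRequest(String endpoint, long duration, boolean success) {
        RequestStats stats = requestStats.computeIfAbsent(endpoint,
                k -> new RequestStats());
        stats.recordRequest(duration, success);
    }

    @Data
    public static class RequestStats {
        private long requestCount;
        private long errorCount;
        private long totalResponseTime;
        private double averageResponseTime;

        // synchronized: the same RequestStats instance is updated by concurrent request threads
        public synchronized void recordRequest(long duration, boolean success) {
            this.requestCount++;
            if (!success) this.errorCount++;
            this.totalResponseTime += duration;
            this.averageResponseTime = (double) totalResponseTime / requestCount;
        }
    }
}
```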
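🔒 7. Security and Exposure Configuration
🛡️ Endpoint Security Strategy
Per-environment security configuration (two YAML documents separated by `---`):
```yaml
# Development profile
spring:
  config:
    activate:
      on-profile: dev   # Spring Boot 2.4+; earlier versions use spring.profiles
management:
  endpoints:
    web:
      exposure:
        include: '*'
      base-path: /manage
  endpoint:
    health:
      show-details: always
    loggers:
      enabled: true
---
# Production profile
spring:
  config:
    activate:
      on-profile: prod
management:
  endpoints:
    web:
      exposure:
        include: health,metrics,info
      base-path: /internal/manage
  endpoint:
    health:
      show-details: never
      show-components: never
    shutdown:
      enabled: false
```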
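Spring Security integration. The example targets Spring Security 5.x; WebSecurityConfigurerAdapter is deprecated since 5.7 and removed in 6.0, where the same rules are declared on a SecurityFilterChain bean.
```java
import org.springframework.context.annotation.Configuration;
import org.springframework.core.annotation.Order;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.config.annotation.web.configuration.WebSecurityConfigurerAdapter;

@Configuration
@EnableWebSecurity
public class ActuatorSecurityConfig {

    @Configuration
    @Order(1)
    public static class ActuatorWebSecurityConfigurationAdapter extends WebSecurityConfigurerAdapter {

        @Override
        protected void configure(HttpSecurity http) throws Exception {
            // The patterns must match management.endpoints.web.base-path (/manage here)
            http.antMatcher("/manage/**")
                .authorizeRequests()
                    // Health endpoint and its groups are open to everyone
                    .antMatchers("/manage/health", "/manage/health/**").permitAll()
                    // Info and metrics (including individual metrics) need MONITOR or ADMIN
                    .antMatchers("/manage/info").hasAnyRole("MONITOR", "ADMIN")
                    .antMatchers("/manage/metrics", "/manage/metrics/**").hasAnyRole("MONITOR", "ADMIN")
                    // Everything else is a sensitive operation and needs ADMIN
                    .antMatchers("/manage/**").hasRole("ADMIN")
                .and()
                .httpBasic();
        }
    }
}
```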
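📊 Endpoint Access Control Matrix
Role permission matrix:
| Endpoint | Anonymous | MONITOR role | ADMIN role | Notes / use case |
|---|---|---|---|---|
| /actuator/health | ✅ | ✅ | ✅ | Health check, open to everyone (used by K8s, Nacos, and load balancer probes) |
| /actuator/info | ❌ | ✅ | ✅ | Basic application info such as version and description; requires authentication |
| /actuator/metrics | ❌ | ✅ | ✅ | Performance metrics (CPU, threads, JVM); available to the monitoring role |
| /actuator/env | ❌ | ❌ | ✅ | Environment variables and configuration properties; sensitive, administrators only |
| /actuator/loggers | ❌ | ❌ | ✅ | Change log levels at runtime; administrators only |
| /actuator/shutdown | ❌ | ❌ | ✅ | Shut the application down remotely (must be enabled explicitly); administrators only |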
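The matrix can be encoded directly as data, which makes it easy to unit test independently of the security framework. The sketch below is a plain-Java model of the table, not Spring Security configuration; `EndpointAccessSketch`, `Role`, and `canAccess` are names made up for this example, and denying unlisted endpoints is an assumption chosen as the safe default.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

public class EndpointAccessSketch {

    enum Role { ANONYMOUS, MONITOR, ADMIN }

    // Roles allowed per endpoint, transcribed from the matrix
    static final Map<String, Set<Role>> MATRIX = Map.of(
            "/actuator/health", EnumSet.allOf(Role.class),
            "/actuator/info", EnumSet.of(Role.MONITOR, Role.ADMIN),
            "/actuator/metrics", EnumSet.of(Role.MONITOR, Role.ADMIN),
            "/actuator/env", EnumSet.of(Role.ADMIN),
            "/actuator/loggers", EnumSet.of(Role.ADMIN),
            "/actuator/shutdown", EnumSet.of(Role.ADMIN));

    /** Endpoints that are not listed are denied for every role (deny by default). */
    static boolean canAccess(String path, Role role) {
        Set<Role> allowed = MATRIX.get(path);
        return allowed != null && allowed.contains(role);
    }

    public static void main(String[] args) {
        System.out.println(canAccess("/actuator/health", Role.ANONYMOUS)); // true
        System.out.println(canAccess("/actuator/metrics", Role.MONITOR));  // true
        System.out.println(canAccess("/actuator/env", Role.MONITOR));      // false
        System.out.println(canAccess("/actuator/shutdown", Role.ADMIN));   // true
    }
}
```

Keeping the rules in one table like this also gives reviewers a single place to check when a new endpoint is exposed.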
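🔍 Endpoint Monitoring and Auditing
Endpoint access audit logging. Spring Boot does not publish a per-request "endpoint accessed" event, so this version intercepts Actuator requests with a servlet filter. `SensitiveOperationEvent` is an application-defined event (hypothetical, not part of Spring Boot) that custom endpoints would publish themselves.
```java
import java.io.IOException;
import java.security.Principal;
import java.time.Instant;

import javax.servlet.FilterChain;        // jakarta.servlet.* on Spring Boot 3
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

@Component
public class EndpointAccessAuditor extends OncePerRequestFilter {

    private static final Logger auditLogger = LoggerFactory.getLogger("ENDPOINT_AUDIT");

    @Override
    protected boolean shouldNotFilter(HttpServletRequest request) {
        // Audit Actuator requests only; adjust the prefix to the configured base-path
        return !request.getRequestURI().startsWith("/actuator");
    }

    @Override
    protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response,
            FilterChain chain) throws ServletException, IOException {
        Principal principal = request.getUserPrincipal();
        auditLogger.info("Endpoint access audit - user: {}, endpoint: {}, IP: {}, time: {}",
                principal != null ? principal.getName() : "anonymous",
                request.getRequestURI(),
                request.getRemoteAddr(),
                Instant.now());
        chain.doFilter(request, response);
    }

    // Listens for the hypothetical application-defined SensitiveOperationEvent
    @EventListener
    public void auditSensitiveOperation(SensitiveOperationEvent event) {
        auditLogger.warn("Sensitive operation audit - user: {}, operation: {}, parameters: {}, result: {}",
                event.getUsername(),
                event.getOperation(),
                event.getParameters(),
                event.getResult());
    }
}
```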
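💎 8. Summary
🎯 Core Value of the Actuator Monitoring Stack
Production monitoring checklist:
- ✅ Health checks: multi-level health status monitoring
- ✅ Metrics collection: business and system metrics covered together
- ✅ Info management: standardized presentation of application metadata
- ✅ Security control: role-based access control for endpoints
- ✅ Extensibility: custom endpoints for business needs
- ✅ Integration: works with tools such as Prometheus and Grafana
📈 Monitoring Evolution Roadmap
Monitoring maturity model:
- Level 1, basic monitoring: system metrics, basic health checks
- Level 2, business monitoring: business metrics, custom endpoints
- Level 3, predictive monitoring: trend forecasting, anomaly detection
- Level 4, intelligent operations: automated remediation, intelligent scheduling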
🔧 Enterprise Best Practices
Monitoring configuration template (Spring Boot 2.x property names; `db` is the built-in DataSource indicator, and `externalService` is assumed to be a custom indicator defined by the application):
```yaml
# Enterprise monitoring configuration
management:
  endpoints:
    web:
      exposure:
        include: health,metrics,info,prometheus
      base-path: /internal/actuator
    jmx:
      exposure:
        include: '*'
  metrics:
    export:
      prometheus:
        enabled: true
      datadog:
        enabled: false
    tags:
      application: ${spring.application.name}
      environment: ${spring.profiles.active}
      cluster: ${app.cluster.name:default}
  endpoint:
    health:
      show-details: when_authorized
      show-components: when_authorized
      group:
        liveness:
          include: ping,diskSpace
        readiness:
          include: db,redis,externalService
```