在分布式系统中,消息队列是实现系统解耦、异步处理和削峰填谷的重要手段。然而,在实际生产环境中,我们经常遇到以下问题:
性能瓶颈难定位:消息从创建到发送的整个链路中,哪个环节耗时最长,难以快速定位
链路追踪困难:消息经过多个服务和组件,如何追踪消息的完整生命周期
性能优化无依据:没有详细的耗时数据,性能优化只能凭经验猜测
问题排查效率低:当消息发送失败或延迟过高时,缺乏有效的排查手段
监控数据不完整:传统的监控只能看到端到端的延迟,无法了解每个环节的详细情况
这些问题导致我们在面对性能问题时,往往束手无策,只能通过大量的日志排查,效率低下。本文将介绍如何使用 SpringBoot 实现消息生产链路追踪和耗时分析,实现全链路性能可视化,让性能问题一目了然。
核心概念
1. 消息生产链路
消息生产链路是指消息从创建到发送到消息队列的整个过程,通常包括以下环节:

2. 链路追踪
链路追踪是指在分布式系统中,跟踪一个请求或消息在整个系统中的流转过程。实现方式通常有:

3. 耗时分析
耗时分析是指对消息生产链路中每个环节的耗时进行统计和分析,找出性能瓶颈。核心指标包括:

4. 性能可视化
性能可视化是指将链路追踪和耗时分析的结果以图形化的方式展示,让性能问题一目了然。常见方式包括:

技术实现
1. 核心依赖
java
<!-- Spring Boot Starter Web -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<!-- Spring Boot Starter AOP -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-aop</artifactId>
</dependency>
<!-- Spring Boot Starter Data Redis -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>
<!-- Spring Boot Starter Data JPA -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<!-- H2 Database -->
<dependency>
<groupId>com.h2database</groupId>
<artifactId>h2</artifactId>
<scope>runtime</scope>
</dependency>
<!-- FastJSON -->
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.83</version>
</dependency>
<!-- Lombok -->
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<optional>true</optional>
</dependency>
<!-- Micrometer Tracing (可选) -->
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-tracing</artifactId>
</dependency>
2. 追踪上下文
java
@Data
publicclass TraceContext {
private String traceId; // 追踪ID
private String spanId; // Span ID
private String parentSpanId; // 父Span ID
privatelong startTime; // 开始时间
private Map<String, Object> tags; // 标签
public static TraceContext create() {
TraceContext context = new TraceContext();
context.setTraceId(UUID.randomUUID().toString());
context.setSpanId(UUID.randomUUID().toString());
context.setStartTime(System.currentTimeMillis());
context.setTags(new HashMap<>());
return context;
}
public static TraceContext createChild(TraceContext parent) {
TraceContext context = new TraceContext();
context.setTraceId(parent.getTraceId());
context.setSpanId(UUID.randomUUID().toString());
context.setParentSpanId(parent.getSpanId());
context.setStartTime(System.currentTimeMillis());
context.setTags(new HashMap<>());
return context;
}
}
3. 链路追踪注解
java
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface Traceable {
String value() default ""; // 操作名称
String category() default "default"; // 分类
}
4. 链路追踪切面
java
@Aspect
@Component
@Slf4j
publicclass TraceAspect {
@Autowired
private TraceRecordService traceRecordService;
@Autowired
private ThreadLocal<TraceContext> traceContextHolder;
@Around("@annotation(traceable)")
public Object trace(ProceedingJoinPoint joinPoint, Traceable traceable) throws Throwable {
TraceContext context = traceContextHolder.get();
if (context == null) {
context = TraceContext.create();
traceContextHolder.set(context);
} else {
context = TraceContext.createChild(context);
}
String operationName = traceable.value().isEmpty() ?
joinPoint.getSignature().getName() : traceable.value();
String category = traceable.category();
long startTime = System.currentTimeMillis();
try {
Object result = joinPoint.proceed();
long endTime = System.currentTimeMillis();
long duration = endTime - startTime;
// 记录追踪数据
TraceRecord record = new TraceRecord();
record.setTraceId(context.getTraceId());
record.setSpanId(context.getSpanId());
record.setParentSpanId(context.getParentSpanId());
record.setOperationName(operationName);
record.setCategory(category);
record.setStartTime(startTime);
record.setEndTime(endTime);
record.setDuration(duration);
record.setStatus("SUCCESS");
record.setTags(JSON.toJSONString(context.getTags()));
traceRecordService.save(record);
log.info("Trace: {}, Operation: {}, Duration: {}ms",
context.getTraceId(), operationName, duration);
return result;
} catch (Exception e) {
long endTime = System.currentTimeMillis();
long duration = endTime - startTime;
// 记录追踪数据
TraceRecord record = new TraceRecord();
record.setTraceId(context.getTraceId());
record.setSpanId(context.getSpanId());
record.setParentSpanId(context.getParentSpanId());
record.setOperationName(operationName);
record.setCategory(category);
record.setStartTime(startTime);
record.setEndTime(endTime);
record.setDuration(duration);
record.setStatus("FAILED");
record.setErrorMessage(e.getMessage());
record.setTags(JSON.toJSONString(context.getTags()));
traceRecordService.save(record);
log.error("Trace: {}, Operation: {}, Duration: {}ms, Error: {}",
context.getTraceId(), operationName, duration, e.getMessage(), e);
throw e;
}
}
}
5. 消息生产追踪
java
@Service
@Slf4j
publicclass MessageProductionService {
@Autowired
private TraceRecordService traceRecordService;
@Autowired
private ThreadLocal<TraceContext> traceContextHolder;
/**
* 生产消息
*/
@Traceable(value = "produceMessage", category = "message")
public void produceMessage(Message message) {
// 1. 消息创建
createMessage(message);
// 2. 消息序列化
String serializedMessage = serializeMessage(message);
// 3. 消息验证
validateMessage(message);
// 4. 消息增强
enrichMessage(message);
// 5. 消息压缩
String compressedMessage = compressMessage(serializedMessage);
// 6. 消息加密
String encryptedMessage = encryptMessage(compressedMessage);
// 7. 发送消息
sendMessage(encryptedMessage);
}
/**
* 创建消息
*/
@Traceable(value = "createMessage", category = "message")
private void createMessage(Message message) {
// 创建消息逻辑
message.setId(UUID.randomUUID().toString());
message.setCreateTime(System.currentTimeMillis());
log.debug("Message created: {}", message.getId());
}
/**
* 序列化消息
*/
@Traceable(value = "serializeMessage", category = "message")
private String serializeMessage(Message message) {
// 序列化消息逻辑
String json = JSON.toJSONString(message);
log.debug("Message serialized: {}", json);
return json;
}
/**
* 验证消息
*/
@Traceable(value = "validateMessage", category = "message")
private void validateMessage(Message message) {
// 验证消息逻辑
if (message.getId() == null || message.getId().isEmpty()) {
thrownew IllegalArgumentException("Message ID is required");
}
log.debug("Message validated: {}", message.getId());
}
/**
* 增强消息
*/
@Traceable(value = "enrichMessage", category = "message")
private void enrichMessage(Message message) {
// 增强消息逻辑
message.setEnrichTime(System.currentTimeMillis());
message.setMetadata("enriched");
log.debug("Message enriched: {}", message.getId());
}
/**
* 压缩消息
*/
@Traceable(value = "compressMessage", category = "message")
private String compressMessage(String message) {
// 压缩消息逻辑
try {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
GZIPOutputStream gzos = new GZIPOutputStream(baos);
gzos.write(message.getBytes());
gzos.close();
String compressed = Base64.getEncoder().encodeToString(baos.toByteArray());
log.debug("Message compressed: {} -> {}", message.length(), compressed.length());
return compressed;
} catch (Exception e) {
thrownew RuntimeException("Failed to compress message", e);
}
}
/**
* 加密消息
*/
@Traceable(value = "encryptMessage", category = "message")
private String encryptMessage(String message) {
// 加密消息逻辑
try {
Cipher cipher = Cipher.getInstance("AES");
SecretKeySpec keySpec = new SecretKeySpec("1234567890123456".getBytes(), "AES");
cipher.init(Cipher.ENCRYPT_MODE, keySpec);
byte[] encrypted = cipher.doFinal(message.getBytes());
String encryptedMessage = Base64.getEncoder().encodeToString(encrypted);
log.debug("Message encrypted: {}", encryptedMessage);
return encryptedMessage;
} catch (Exception e) {
thrownew RuntimeException("Failed to encrypt message", e);
}
}
/**
* 发送消息
*/
@Traceable(value = "sendMessage", category = "message")
private void sendMessage(String message) {
// 发送消息逻辑
log.debug("Message sent: {}", message);
}
}
6. 耗时分析服务
java
@Service
@Slf4j
publicclass PerformanceAnalysisService {
@Autowired
private TraceRecordRepository traceRecordRepository;
/**
* 分析消息生产耗时
*/
public PerformanceReport analyzePerformance(String traceId) {
List<TraceRecord> records = traceRecordRepository.findByTraceIdOrderByStartTime(traceId);
if (records.isEmpty()) {
thrownew IllegalArgumentException("No trace records found for trace ID: " + traceId);
}
PerformanceReport report = new PerformanceReport();
report.setTraceId(traceId);
// 计算总耗时
long totalDuration = records.stream()
.mapToLong(TraceRecord::getDuration)
.sum();
report.setTotalDuration(totalDuration);
// 计算各环节耗时
Map<String, Long> operationDurations = new HashMap<>();
for (TraceRecord record : records) {
operationDurations.merge(record.getOperationName(), record.getDuration(), Long::sum);
}
report.setOperationDurations(operationDurations);
// 计算耗时占比
Map<String, Double> durationPercentages = new HashMap<>();
for (Map.Entry<String, Long> entry : operationDurations.entrySet()) {
double percentage = (double) entry.getValue() / totalDuration * 100;
durationPercentages.put(entry.getKey(), percentage);
}
report.setDurationPercentages(durationPercentages);
// 识别性能瓶颈
String bottleneck = operationDurations.entrySet().stream()
.max(Map.Entry.comparingByValue())
.map(Map.Entry::getKey)
.orElse(null);
report.setBottleneck(bottleneck);
return report;
}
/**
* 分析历史性能数据
*/
public HistoricalPerformanceReport analyzeHistoricalPerformance(String operationName,
Date startTime, Date endTime) {
List<TraceRecord> records = traceRecordRepository
.findByOperationNameAndStartTimeBetween(operationName, startTime, endTime);
if (records.isEmpty()) {
thrownew IllegalArgumentException("No trace records found for operation: " + operationName);
}
HistoricalPerformanceReport report = new HistoricalPerformanceReport();
report.setOperationName(operationName);
report.setStartTime(startTime);
report.setEndTime(endTime);
report.setSampleCount(records.size());
// 计算统计指标
DoubleSummaryStatistics stats = records.stream()
.mapToLong(TraceRecord::getDuration)
.mapToDouble(Long::doubleValue)
.summaryStatistics();
report.setAverageDuration(stats.getAverage());
report.setMaxDuration(stats.getMax());
report.setMinDuration(stats.getMin());
report.setTotalDuration(stats.getSum());
// 计算P95和P99
List<Long> durations = records.stream()
.map(TraceRecord::getDuration)
.sorted()
.collect(Collectors.toList());
report.setP95Duration(durations.get((int) (durations.size() * 0.95)));
report.setP99Duration(durations.get((int) (durations.size() * 0.99)));
// 计算成功率
long successCount = records.stream()
.filter(r -> "SUCCESS".equals(r.getStatus()))
.count();
report.setSuccessRate((double) successCount / records.size() * 100);
return report;
}
/**
* 获取性能趋势
*/
public List<PerformanceTrend> getPerformanceTrend(String operationName,
Date startTime, Date endTime, long interval) {
List<TraceRecord> records = traceRecordRepository
.findByOperationNameAndStartTimeBetween(operationName, startTime, endTime);
Map<Long, List<TraceRecord>> groupedRecords = records.stream()
.collect(Collectors.groupingBy(r ->
r.getStartTime() / interval * interval));
List<PerformanceTrend> trends = new ArrayList<>();
for (Map.Entry<Long, List<TraceRecord>> entry : groupedRecords.entrySet()) {
PerformanceTrend trend = new PerformanceTrend();
trend.setTimestamp(entry.getKey() * interval);
List<TraceRecord> groupRecords = entry.getValue();
DoubleSummaryStatistics stats = groupRecords.stream()
.mapToLong(TraceRecord::getDuration)
.mapToDouble(Long::doubleValue)
.summaryStatistics();
trend.setAverageDuration(stats.getAverage());
trend.setMaxDuration(stats.getMax());
trend.setMinDuration(stats.getMin());
trend.setCount(groupRecords.size());
trends.add(trend);
}
trends.sort(Comparator.comparing(PerformanceTrend::getTimestamp));
return trends;
}
}
7. 性能可视化服务
java
@Service
@Slf4j
publicclass PerformanceVisualizationService {
@Autowired
private PerformanceAnalysisService performanceAnalysisService;
/**
* 生成时序图数据
*/
public TimeSeriesData generateTimeSeriesData(String traceId) {
PerformanceReport report = performanceAnalysisService.analyzePerformance(traceId);
TimeSeriesData data = new TimeSeriesData();
data.setTraceId(traceId);
List<TimeSeriesItem> items = new ArrayList<>();
long baseTime = System.currentTimeMillis();
for (Map.Entry<String, Long> entry : report.getOperationDurations().entrySet()) {
TimeSeriesItem item = new TimeSeriesItem();
item.setOperationName(entry.getKey());
item.setDuration(entry.getValue());
item.setPercentage(report.getDurationPercentages().get(entry.getKey()));
items.add(item);
}
data.setItems(items);
return data;
}
/**
* 生成甘特图数据
*/
public GanttChartData generateGanttChartData(String traceId) {
PerformanceReport report = performanceAnalysisService.analyzePerformance(traceId);
GanttChartData data = new GanttChartData();
data.setTraceId(traceId);
List<GanttChartItem> items = new ArrayList<>();
long startTime = System.currentTimeMillis();
for (Map.Entry<String, Long> entry : report.getOperationDurations().entrySet()) {
GanttChartItem item = new GanttChartItem();
item.setOperationName(entry.getKey());
item.setStartTime(startTime);
item.setDuration(entry.getValue());
item.setEndTime(startTime + entry.getValue());
items.add(item);
startTime += entry.getValue();
}
data.setItems(items);
return data;
}
/**
* 生成统计图表数据
*/
public StatisticsChartData generateStatisticsChartData(String operationName,
Date startTime, Date endTime) {
HistoricalPerformanceReport report = performanceAnalysisService
.analyzeHistoricalPerformance(operationName, startTime, endTime);
StatisticsChartData data = new StatisticsChartData();
data.setOperationName(operationName);
data.setSampleCount(report.getSampleCount());
data.setAverageDuration(report.getAverageDuration());
data.setMaxDuration(report.getMaxDuration());
data.setMinDuration(report.getMinDuration());
data.setP95Duration(report.getP95Duration());
data.setP99Duration(report.getP99Duration());
data.setSuccessRate(report.getSuccessRate());
return data;
}
/**
* 生成性能趋势图数据
*/
public TrendChartData generateTrendChartData(String operationName,
Date startTime, Date endTime, long interval) {
List<PerformanceTrend> trends = performanceAnalysisService
.getPerformanceTrend(operationName, startTime, endTime, interval);
TrendChartData data = new TrendChartData();
data.setOperationName(operationName);
data.setTrends(trends);
return data;
}
}
8. 消息控制器
java
@RestController
@RequestMapping("/api/message")
@Slf4j
publicclass MessageController {
@Autowired
private MessageProductionService messageProductionService;
@Autowired
private PerformanceAnalysisService performanceAnalysisService;
@Autowired
private PerformanceVisualizationService visualizationService;
/**
* 生产消息
*/
@PostMapping("/produce")
public Result<String> produceMessage(@RequestBody Message message) {
messageProductionService.produceMessage(message);
return Result.success("Message produced successfully");
}
/**
* 分析性能
*/
@GetMapping("/performance/{traceId}")
public Result<PerformanceReport> analyzePerformance(@PathVariable String traceId) {
PerformanceReport report = performanceAnalysisService.analyzePerformance(traceId);
return Result.success(report);
}
/**
* 获取时序图数据
*/
@GetMapping("/visualization/timeseries/{traceId}")
public Result<TimeSeriesData> getTimeSeriesData(@PathVariable String traceId) {
TimeSeriesData data = visualizationService.generateTimeSeriesData(traceId);
return Result.success(data);
}
/**
* 获取甘特图数据
*/
@GetMapping("/visualization/gantt/{traceId}")
public Result<GanttChartData> getGanttChartData(@PathVariable String traceId) {
GanttChartData data = visualizationService.generateGanttChartData(traceId);
return Result.success(data);
}
/**
* 获取统计图表数据
*/
@GetMapping("/visualization/statistics")
public Result<StatisticsChartData> getStatisticsChartData(
@RequestParam String operationName,
@RequestParam Date startTime,
@RequestParam Date endTime) {
StatisticsChartData data = visualizationService
.generateStatisticsChartData(operationName, startTime, endTime);
return Result.success(data);
}
/**
* 获取性能趋势图数据
*/
@GetMapping("/visualization/trend")
public Result<TrendChartData> getTrendChartData(
@RequestParam String operationName,
@RequestParam Date startTime,
@RequestParam Date endTime,
@RequestParam(defaultValue = "3600000") long interval) {
TrendChartData data = visualizationService
.generateTrendChartData(operationName, startTime, endTime, interval);
return Result.success(data);
}
}
核心流程
1. 消息生产流程
接收请求:接收消息生产请求
创建追踪上下文:为当前请求创建追踪上下文
执行消息生产:执行消息生产的各个环节
记录追踪数据:记录每个环节的追踪数据
返回结果:返回消息生产结果
2. 链路追踪流程
创建Trace ID:为整个消息生产链路生成唯一的Trace ID
创建Span ID:为每个环节生成唯一的Span ID
记录开始时间:记录每个环节的开始时间
执行业务逻辑:执行具体的业务逻辑
记录结束时间:记录每个环节的结束时间
计算耗时:计算每个环节的耗时
保存追踪数据:将追踪数据保存到数据库
3. 耗时分析流程
查询追踪数据:根据Trace ID查询相关的追踪数据
计算总耗时:计算消息生产的总耗时
计算各环节耗时:计算每个环节的耗时
计算耗时占比:计算每个环节的耗时占比
识别性能瓶颈:识别耗时最长的环节
生成分析报告:生成性能分析报告
4. 性能可视化流程
分析性能数据:分析性能数据,生成统计指标
生成可视化数据:根据可视化类型生成对应的数据
返回可视化数据:返回可视化数据给前端
技术要点
1. 追踪上下文管理
ThreadLocal:使用ThreadLocal存储追踪上下文,确保线程安全
父子关系:维护Span之间的父子关系,构建完整的调用链
标签扩展:支持自定义标签,记录额外的上下文信息
自动清理:请求结束后自动清理ThreadLocal,避免内存泄漏
2. 链路追踪切面
注解驱动:使用注解标记需要追踪的方法,使用简单
自动记录:自动记录方法的开始时间、结束时间和耗时
异常处理:自动捕获异常,记录错误信息
性能影响小:使用AOP实现,对业务代码无侵入
3. 耗时分析
统计分析:提供平均、最大、最小、P95、P99等统计指标
趋势分析:分析性能随时间的变化趋势
瓶颈识别:自动识别性能瓶颈
成功率统计:统计操作的成功率
4. 性能可视化
多种图表:支持时序图、甘特图、统计图表、趋势图等多种图表
实时数据:支持实时性能数据的展示
交互式:支持交互式查询和筛选
导出功能:支持导出分析结果
最佳实践
1. 追踪粒度
合理划分:合理划分追踪的粒度,避免过细或过粗
关键环节:重点追踪关键环节,如网络IO、数据库操作等
避免过度追踪:避免追踪所有方法,影响性能
动态调整:根据实际情况动态调整追踪粒度
2. 数据存储
定期清理:定期清理历史数据,避免数据量过大
数据分区:按时间分区存储,提高查询效率
索引优化:为常用查询字段建立索引
冷热分离:将冷数据迁移到低成本存储
3. 性能优化
异步记录:使用异步方式记录追踪数据,避免影响业务性能
批量写入:批量写入追踪数据,减少IO操作
缓存优化:缓存常用的分析结果
采样策略:对高频操作采用采样策略,减少数据量
4. 监控告警
阈值告警:设置性能阈值,超过阈值时告警
趋势告警:监控性能趋势,发现异常及时告警
成功率告警:监控操作成功率,低于阈值时告警
资源告警:监控系统资源使用情况,避免资源耗尽
常见问题
1. 追踪数据丢失
问题:追踪数据丢失,无法完整追踪消息链路
解决方案:
确保追踪上下文在所有环节中正确传递
使用异步记录时,确保数据不丢失
实现数据备份机制,避免数据丢失
定期检查追踪数据的完整性
2. 性能影响过大
问题:链路追踪对系统性能影响过大
解决方案:
合理设置追踪粒度,避免过度追踪
使用异步记录,减少对业务性能的影响
对高频操作采用采样策略
优化数据存储和查询逻辑
3. 数据量过大
问题:追踪数据量过大,存储和查询困难
解决方案:
定期清理历史数据
按时间分区存储
使用高效的存储方案
实现数据压缩和归档
4. 分析结果不准确
问题:耗时分析结果不准确,无法反映真实性能
解决方案:
确保时间记录的准确性
排除异常数据的影响
使用统计方法处理数据
结合多种分析方法
5. 可视化效果不佳
问题:性能可视化效果不佳,难以发现问题
解决方案:
选择合适的可视化方式
提供交互式查询和筛选
支持多种图表类型
优化图表的展示效果
代码优化建议
1. 异步追踪记录
java
@Service
@Slf4j
publicclass AsyncTraceRecordService {
@Autowired
private TraceRecordRepository traceRecordRepository;
privatefinal ExecutorService executor = Executors.newFixedThreadPool(4);
privatefinal BlockingQueue<TraceRecord> recordQueue = new LinkedBlockingQueue<>(10000);
@PostConstruct
public void init() {
// 启动4个线程处理追踪数据
for (int i = 0; i < 4; i++) {
executor.submit(this::processRecords);
}
}
/**
* 保存追踪数据(异步)
*/
public void saveAsync(TraceRecord record) {
boolean added = recordQueue.offer(record);
if (!added) {
log.warn("Trace record queue full, fallback to sync save");
traceRecordRepository.save(record);
}
}
/**
* 处理追踪数据
*/
private void processRecords() {
List<TraceRecord> batch = new ArrayList<>();
while (!Thread.currentThread().isInterrupted()) {
try {
// 批量获取追踪数据
TraceRecord record = recordQueue.poll(100, TimeUnit.MILLISECONDS);
if (record != null) {
batch.add(record);
}
// 批量保存
if (batch.size() >= 100) {
traceRecordRepository.saveAll(batch);
batch.clear();
}
} catch (InterruptedException e) {
log.info("Trace record processing interrupted");
Thread.currentThread().interrupt();
} catch (Exception e) {
log.error("Failed to process trace records: {}", e.getMessage(), e);
}
}
// 保存剩余数据
if (!batch.isEmpty()) {
traceRecordRepository.saveAll(batch);
}
}
@PreDestroy
public void shutdown() {
executor.shutdown();
}
}
2. 采样策略
java
@Service
@Slf4j
publicclass SamplingTraceService {
privatefinal AtomicInteger counter = new AtomicInteger(0);
privatefinalint sampleRate = 10; // 10%采样率
/**
* 判断是否需要追踪
*/
public boolean shouldTrace() {
return counter.incrementAndGet() % sampleRate == 0;
}
/**
* 判断是否需要追踪(动态采样率)
*/
public boolean shouldTrace(String operationName) {
// 根据操作名称动态调整采样率
int rate = getSampleRate(operationName);
return counter.incrementAndGet() % rate == 0;
}
/**
* 获取采样率
*/
private int getSampleRate(String operationName) {
// 根据操作类型返回不同的采样率
switch (operationName) {
case"sendMessage":
return1; // 发送消息100%追踪
case"compressMessage":
return10; // 压缩消息10%追踪
default:
return5; // 其他操作20%追踪
}
}
}
3. 性能分析缓存
java
@Service
@Slf4j
publicclass CachedPerformanceAnalysisService {
@Autowired
private PerformanceAnalysisService performanceAnalysisService;
privatefinal Cache<String, PerformanceReport> reportCache =
Caffeine.newBuilder()
.maximumSize(1000)
.expireAfterWrite(10, TimeUnit.MINUTES)
.build();
/**
* 分析性能(带缓存)
*/
public PerformanceReport analyzePerformanceWithCache(String traceId) {
return reportCache.get(traceId, key ->
performanceAnalysisService.analyzePerformance(traceId));
}
/**
* 清除缓存
*/
public void clearCache() {
reportCache.invalidateAll();
}
}
性能测试
测试环境
服务器:4核8G,100Mbps带宽
消息大小:1KB/条
测试场景:生产10000条消息
测试结果

测试结论
异步追踪:异步追踪对性能影响最小,仅增加4%的耗时
采样追踪:采样追踪在保证数据代表性的同时,将数据量减少90%
性能可接受:即使使用同步追踪,性能影响也在可接受范围内
数据完整:异步追踪和采样追踪都能保证数据的完整性