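Introduction: Debugging a Production "Log Maze"
It is 2 a.m. and an urgent call wakes you: "The online order payment success rate just dropped from 99.9% to 85%!" You open the log system in a hurry, and this is what you see:
2024-03-20 02:15:23.123 INFO [http-nio-8080-exec-1] c.e.s.OrderService : Start processing order
2024-03-20 02:15:23.145 INFO [http-nio-8080-exec-3] c.e.s.PaymentService : Calling payment gateway
2024-03-20 02:15:23.167 ERROR [http-nio-8080-exec-5] c.e.s.NotificationService : Failed to send SMS
2024-03-20 02:15:23.189 INFO [http-nio-8080-exec-2] c.e.s.OrderService : Order processing finished
The problem is plain: entries from many concurrent requests are interleaved, and nothing in a line tells you which request it belongs to. Once a request fans out across thread pools and microservices, each logging on its own, its entries cannot be stitched back together. The result is a chaotic "log maze".
This is the core problem MDC (Mapped Diagnostic Context) solves: give every request a unique identifier and have all related log entries carry it automatically, so the request can be traced end to end.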
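Chapter 1: MDC Basics, More Than a ThreadLocal Wrapper
1.1 What is MDC?
MDC (Mapped Diagnostic Context) is a thread-bound diagnostic context exposed through the SLF4J API and implemented by the logging backend (Logback, Log4j2). You can think of it as a Map<String, String> that belongs to the current thread.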
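// Conceptually, MDC works like this (a simplified sketch; the real SLF4J MDC
// delegates to an MDCAdapter supplied by the logging backend):
public class MDC {
    // One map per thread, held in a ThreadLocal
    private static final ThreadLocal<Map<String, String>> context =
            ThreadLocal.withInitial(HashMap::new);

    public static void put(String key, String val) {
        // Store the key-value pair in the current thread's map
        context.get().put(key, val);
    }

    public static String get(String key) {
        // Read the value from the current thread's map
        return context.get().get(key);
    }

    public static void clear() {
        // Drop the current thread's map
        context.remove();
    }
}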
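To make the "bound to the current thread" part concrete, here is a minimal runnable sketch that uses only the JDK. `MiniMdc` and `readFromNewThread` are hypothetical names invented for this illustration; they are not SLF4J classes, and the real MDC has more behavior than this.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified stand-in for MDC: one String map per thread (illustrative only). */
final class MiniMdc {
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    private MiniMdc() {}

    static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    static String get(String key) {
        return CONTEXT.get().get(key);
    }

    static void clear() {
        CONTEXT.remove();
    }

    /** Returns the value a freshly started thread sees for the key. */
    static String readFromNewThread(String key) {
        String[] seen = new String[1];
        Thread worker = new Thread(() -> seen[0] = get(key));
        worker.start();
        try {
            worker.join(); // join() makes the worker's write visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

A value stored on one thread is invisible to any other thread, which is exactly why asynchronous code needs explicit propagation later in this article.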
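1.2 Why MDC rather than something else?
| Approach | Pros | Cons | Best for |
|---|---|---|---|
| MDC | Lightweight, no changes to business code, built into the logging frameworks | Propagation across threads must be handled by hand | Monoliths, simple microservices |
| OpenTelemetry | Powerful, standardized, cross-language | Heavier, needs extra dependencies | Complex microservices, polyglot systems |
| Spring Cloud Sleuth | Native to the Spring ecosystem, automatic integration | Depends on Spring Cloud, heavier; targets Spring Boot 2.x and is superseded by Micrometer Tracing in Spring Boot 3 | Spring Cloud microservices |
| Custom ThreadLocal | Full control | You manage the lifecycle yourself; easy to leak | Special custom requirements |
Core advantages of MDC:
- No changes to business code: set it once at the entry point and every log entry carries it
- Low overhead: a put is a thread-local map write, usually negligible next to the cost of writing the log line
- Native support in the logging frameworks: Logback and Log4j2 both read it through the %X pattern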
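Chapter 2: Basic MDC Usage in Spring Boot
2.1 Minimal setup: make the logs "talk" in five minutes
Step 1: Create a filter that sets the request ID
// Servlet types come from jakarta.servlet (Spring Boot 3) or javax.servlet (Spring Boot 2)
@Component
public class TraceIdFilter implements Filter {
    private static final String TRACE_ID = "traceId";

    @Override
    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain) throws IOException, ServletException {
        // Generate a trace ID for this request
        String traceId = generateTraceId();
        try {
            // 1. Put the traceId into the MDC
            MDC.put(TRACE_ID, traceId);
            // 2. Optional: echo the traceId in a response header to help front-end debugging
            if (response instanceof HttpServletResponse) {
                ((HttpServletResponse) response).addHeader("X-Trace-Id", traceId);
            }
            // 3. Continue processing the request
            chain.doFilter(request, response);
        } finally {
            // 4. Essential: container threads are pooled, so without this the next
            //    request on the same thread would inherit a stale traceId
            MDC.clear();
        }
    }

    private String generateTraceId() {
        // Option 1: UUID (universal but long)
        // return UUID.randomUUID().toString();
        // Option 2: timestamp + random number (shorter)
        // return System.currentTimeMillis() + "-" + (int) (Math.random() * 1000);
        // Option 3 (used here): second-resolution timestamp + process ID + random
        // number in [0, 9999]. Readable, but collisions are possible under load;
        // prefer option 1 when uniqueness matters.
        return String.format("%s-%d-%04d",
                LocalDateTime.now().format(DateTimeFormatter.ofPattern("yyyyMMddHHmmss")),
                ProcessHandle.current().pid(),
                ThreadLocalRandom.current().nextInt(10_000));
    }
}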
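If readability of the ID matters less than collision resistance, a random ID is a common alternative to the timestamp format. The sketch below is one possible shape and uses only the JDK; `TraceIds` and `newTraceId` are hypothetical names. It renders 128 random bits as 32 lowercase hex characters, the same shape the W3C Trace Context format uses for its trace-id field.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Illustrative generator for random trace IDs (not part of any library). */
final class TraceIds {
    private TraceIds() {}

    /** Returns 128 random bits as 32 lowercase hex characters. */
    static String newTraceId() {
        ThreadLocalRandom random = ThreadLocalRandom.current();
        // %016x prints a long as unsigned hex, zero-padded to 16 characters
        return String.format("%016x%016x", random.nextLong(), random.nextLong());
    }
}
```

`ThreadLocalRandom` is not cryptographically secure. That is usually acceptable for correlation IDs, but it is an assumption worth revisiting if an ID is ever treated as a secret.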
Step 2: Configure the log output format
<!-- logback-spring.xml -->
<configuration>
    <!-- Log pattern; %X{traceId} prints the traceId stored in the MDC -->
    <property name="LOG_PATTERN"
              value="%d{yyyy-MM-dd HH:mm:ss.SSS} [%X{traceId}] [%thread] %-5level %logger{36} - %msg%n"/>
    <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>${LOG_PATTERN}</pattern>
        </encoder>
    </appender>
    <root level="INFO">
        <appender-ref ref="CONSOLE" />
    </root>
</configuration>
Step 3: See the result
With the configuration in place, the log output changes like this:
# Before (no way to tell requests apart)
2024-03-20 10:15:23.123 INFO [http-nio-8080-exec-1] c.e.s.OrderService : Order placed
2024-03-20 10:15:23.145 INFO [http-nio-8080-exec-3] c.e.s.PaymentService : Calling payment
# After (every line of the request carries the same traceId)
2024-03-20 10:15:23.123 [20240320101523-1234-0427] [http-nio-8080-exec-1] INFO  c.e.s.OrderService - Order placed
2024-03-20 10:15:23.145 [20240320101523-1234-0427] [http-nio-8080-exec-1] INFO  c.e.s.PaymentService - Calling payment
You can now filter on traceId=20240320101523-1234-0427 to pull out every log entry of a single request.
2.2 Enhanced version: accept a TraceId from upstream
In a real microservice setup the request may come from an upstream service, so the filter should accept a TraceId that is passed in:
@Component
public class EnhancedTraceFilter implements Filter {
    private static final String[] TRACE_ID_HEADERS = {
            "X-Trace-Id",     // custom convention
            "X-B3-TraceId",   // Zipkin B3 propagation
            "traceId",        // another common spelling
            "X-Request-Id"    // common among gateways and cloud load balancers
    };

    @Override
    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain) throws IOException, ServletException {
        HttpServletRequest httpRequest = (HttpServletRequest) request;
        // 1. Try to read the traceId from the request headers
        String traceId = extractTraceIdFromHeaders(httpRequest);
        // 2. Generate a new one if none was supplied
        //    (generateTraceId() is the helper from section 2.1)
        if (traceId == null || traceId.trim().isEmpty()) {
            traceId = generateTraceId();
        }
        try {
            // 3. Put it into the MDC
            MDC.put("traceId", traceId);
            // 4. Other useful request attributes
            MDC.put("userId", extractUserId(httpRequest));
            // Behind a reverse proxy this is the proxy's address, not the client's
            MDC.put("clientIp", httpRequest.getRemoteAddr());
            MDC.put("requestURI", httpRequest.getRequestURI());
            MDC.put("userAgent", httpRequest.getHeader("User-Agent"));
            // 5. Continue processing
            chain.doFilter(request, response);
        } finally {
            MDC.clear();
        }
    }

    private String extractTraceIdFromHeaders(HttpServletRequest request) {
        for (String header : TRACE_ID_HEADERS) {
            String traceId = request.getHeader(header);
            if (traceId != null && !traceId.trim().isEmpty()) {
                return traceId.trim();
            }
        }
        return null;
    }

    private String extractUserId(HttpServletRequest request) {
        // Extract the user ID from the token or session.
        // Simplified here; a real project implements this for its own auth scheme.
        String authHeader = request.getHeader("Authorization");
        if (authHeader != null && authHeader.startsWith("Bearer ")) {
            // Parse the JWT to obtain the userId
            // return jwtUtil.extractUserId(authHeader.substring(7));
        }
        return "anonymous";
    }
}
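A header supplied by the caller ends up verbatim in every log line, so it is worth validating before it goes into the MDC; an unchecked value containing line breaks could forge log entries. The following is a minimal sketch using only the JDK. `TraceIdSanitizer`, `sanitizeOrNull`, and the accepted character set and length limits are assumptions made for this illustration, not an established standard.

```java
import java.util.regex.Pattern;

/** Illustrative validator for trace IDs received from upstream callers. */
final class TraceIdSanitizer {
    // Assumed policy: 8 to 64 characters, limited to letters, digits, '-' and '_'
    private static final Pattern SAFE = Pattern.compile("[A-Za-z0-9_-]{8,64}");

    private TraceIdSanitizer() {}

    /** Returns the trimmed ID when it is well formed, otherwise null. */
    static String sanitizeOrNull(String candidate) {
        if (candidate == null) {
            return null;
        }
        String trimmed = candidate.trim();
        return SAFE.matcher(trimmed).matches() ? trimmed : null;
    }
}
```

In a filter like the one above, a null result would simply fall through to generating a fresh ID.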
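Chapter 3: Advanced Scenarios, Crossing Thread Boundaries
3.1 The problem: MDC is lost in asynchronous tasks
This is the most common problem people hit with MDC:
@Service
@Slf4j
public class OrderService {
    @Async // this method runs on a different thread
    public void asyncProcessOrder(Order order) {
        // The MDC is empty here because the thread has changed
        log.info("Processing order asynchronously: {}", order.getId()); // no traceId!
    }
}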
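The effect can be reproduced without Spring or a logging framework. The sketch below uses a plain `ThreadLocal` as a stand-in for the MDC and a single-thread pool as a stand-in for the async executor; `ContextLossDemo`, `withCallerContext`, and `run` are hypothetical names for this illustration. It shows both the loss and the general shape of the fix: capture on the submitting thread, restore on the worker.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Demonstrates thread-local context loss on a pool thread and a capture/restore fix. */
final class ContextLossDemo {
    static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    private ContextLossDemo() {}

    /** Captures the caller's value now and restores it on whichever thread runs the task. */
    static <T> Callable<T> withCallerContext(Callable<T> task) {
        String captured = TRACE_ID.get();      // evaluated on the submitting thread
        return () -> {
            TRACE_ID.set(captured);            // evaluated on the worker thread
            try {
                return task.call();
            } finally {
                TRACE_ID.remove();             // leave the pooled thread clean
            }
        };
    }

    /** Returns { value seen by a plain task, value seen by a wrapped task }. */
    static String[] run() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        TRACE_ID.set("req-1");
        try {
            Callable<String> readTraceId = () -> String.valueOf(TRACE_ID.get());
            String plain = pool.submit(readTraceId).get();
            String propagated = pool.submit(withCallerContext(readTraceId)).get();
            return new String[] { plain, propagated };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            TRACE_ID.remove();
            pool.shutdown();
        }
    }
}
```

The plain task sees no value (rendered as the string "null"), while the wrapped task sees "req-1". The key design point is where the capture happens: it must run before the task is handed to the pool.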
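3.2 The fix: passing the MDC between threads
Option 1: Pass the context explicitly (simple cases)
// Caller: snapshot the MDC on the request thread and hand it to the async method
// orderService.asyncProcessOrder(order, MDC.getCopyOfContextMap());
@Service
@Slf4j
public class OrderService {
    @Async
    public void asyncProcessOrder(Order order, Map<String, String> callerContext) {
        // Restore the caller's snapshot. Calling MDC.getCopyOfContextMap() here would
        // be too late: this code already runs on the async thread, whose MDC is empty.
        if (callerContext != null) {
            MDC.setContextMap(callerContext);
        }
        try {
            log.info("Processing order asynchronously: {}", order.getId()); // traceId is present now
            // ... business logic
        } finally {
            MDC.clear();
        }
    }
}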
Option 2: Use a TaskDecorator (the idiomatic choice with Spring executors)
@Configuration
@EnableAsync
public class AsyncConfig implements AsyncConfigurer {
    @Override
    public Executor getAsyncExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(10);
        executor.setMaxPoolSize(50);
        executor.setQueueCapacity(100);
        executor.setThreadNamePrefix("Async-");
        // Key step: install the TaskDecorator
        executor.setTaskDecorator(new MdcTaskDecorator());
        executor.initialize();
        return executor;
    }
}

// Task decorator that carries the MDC across threads
public class MdcTaskDecorator implements TaskDecorator {
    @Override
    public Runnable decorate(Runnable runnable) {
        // decorate() runs on the submitting thread: capture its MDC here
        Map<String, String> contextMap = MDC.getCopyOfContextMap();
        return () -> {
            try {
                // Before the task runs on the pool thread: restore the captured context
                if (contextMap != null) {
                    MDC.setContextMap(contextMap);
                }
                // Optionally add markers specific to async execution
                MDC.put("async", "true");
                MDC.put("asyncThread", Thread.currentThread().getName());
                // Run the original task
                runnable.run();
            } finally {
                // After the task: clean up so the pooled thread starts fresh next time
                MDC.clear();
            }
        };
    }
}
Option 3: With CompletableFuture (Java 8+)
@Service
@Slf4j
public class OrderService {
    public CompletableFuture<OrderResult> processOrderAsync(Order order) {
        // Capture the current MDC on the calling thread
        Map<String, String> mdcContext = MDC.getCopyOfContextMap();
        // Without an explicit executor, supplyAsync runs on ForkJoinPool.commonPool()
        return CompletableFuture.supplyAsync(() -> {
            try {
                // Restore the MDC on the worker thread
                if (mdcContext != null) {
                    MDC.setContextMap(mdcContext);
                }
                log.info("Start processing order asynchronously: {}", order.getId());
                OrderResult result = doProcessOrder(order);
                log.info("Asynchronous order processing finished: {}", order.getId());
                return result;
            } finally {
                MDC.clear();
            }
        });
    }
}
Option 4: A general-purpose thread pool wrapper
public class MdcAwareThreadPoolExecutor extends ThreadPoolExecutor {
    public MdcAwareThreadPoolExecutor(int corePoolSize, int maximumPoolSize,
                                      long keepAliveTime, TimeUnit unit,
                                      BlockingQueue<Runnable> workQueue) {
        super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue);
    }

    @Override
    public void execute(Runnable command) {
        // submit() and invokeAll() are implemented on top of execute(), so wrapping
        // here covers them too; also overriding submit() would wrap every task twice
        super.execute(wrap(command));
    }

    private Runnable wrap(Runnable runnable) {
        // Captured on the submitting thread
        Map<String, String> context = MDC.getCopyOfContextMap();
        return () -> {
            // Remember what the worker thread had, in case the task runs on the
            // caller's own thread (for example under CallerRunsPolicy)
            Map<String, String> previous = MDC.getCopyOfContextMap();
            try {
                if (context != null) {
                    MDC.setContextMap(context);
                } else {
                    MDC.clear();
                }
                runnable.run();
            } finally {
                if (previous != null) {
                    MDC.setContextMap(previous);
                } else {
                    MDC.clear();
                }
            }
        };
    }
}
Chapter 4: End-to-End Tracing Across Microservices
4.1 Passing the TraceId between services
With RestTemplate
@Configuration
public class RestTemplateConfig {
    @Bean
    public RestTemplate restTemplate() {
        RestTemplate restTemplate = new RestTemplate();
        // Add an interceptor that forwards the TraceId automatically
        List<ClientHttpRequestInterceptor> interceptors = restTemplate.getInterceptors();
        interceptors.add(new TraceIdRestTemplateInterceptor());
        restTemplate.setInterceptors(interceptors);
        return restTemplate;
    }
}

// TraceId interceptor
@Slf4j
public class TraceIdRestTemplateInterceptor implements ClientHttpRequestInterceptor {
    private static final String TRACE_ID_HEADER = "X-Trace-Id";

    @Override
    public ClientHttpResponse intercept(HttpRequest request, byte[] body,
                                        ClientHttpRequestExecution execution) throws IOException {
        // Read the traceId from the MDC
        String traceId = MDC.get("traceId");
        if (traceId != null) {
            // set() replaces any existing value, so retries do not duplicate the header
            request.getHeaders().set(TRACE_ID_HEADER, traceId);
            // Other context can be forwarded the same way
            String userId = MDC.get("userId");
            if (userId != null) {
                request.getHeaders().set("X-User-Id", userId);
            }
        }
        // Log the outgoing request
        log.info("Calling external service: {} {}", request.getMethod(), request.getURI());
        // Execute the request
        ClientHttpResponse response = execution.execute(request, body);
        // Log the response
        log.info("External service responded: {} {}", response.getStatusCode(), request.getURI());
        return response;
    }
}
With FeignClient
@Configuration
public class FeignConfig {
    @Bean
    public RequestInterceptor traceIdFeignInterceptor() {
        return requestTemplate -> {
            // Read the traceId from the MDC
            String traceId = MDC.get("traceId");
            if (traceId != null) {
                requestTemplate.header("X-Trace-Id", traceId);
            }
            // Forward the user information
            String userId = MDC.get("userId");
            if (userId != null) {
                requestTemplate.header("X-User-Id", userId);
            }
        };
    }
}

// Used from a FeignClient
@FeignClient(name = "payment-service", configuration = FeignConfig.class)
public interface PaymentServiceClient {
    @PostMapping("/api/payments")
    PaymentResult createPayment(@RequestBody PaymentRequest request);
}
With WebClient (reactive)
@Component
public class ReactiveTraceFilter implements WebFilter {
    @Override
    public Mono<Void> filter(ServerWebExchange exchange, WebFilterChain chain) {
        // Take the traceId from the request header, or generate one
        // (generateTraceId() is the helper from section 2.1, omitted here)
        String incoming = exchange.getRequest().getHeaders().getFirst("X-Trace-Id");
        String traceId = (incoming != null) ? incoming : generateTraceId();
        // Store it in the Reactor Context so the rest of the chain can read it
        return chain.filter(exchange)
                .contextWrite(Context.of("traceId", traceId));
    }
}

// Used from a WebClient call
@Component
public class ExternalServiceClient {
    public Mono<String> callExternalService() {
        return Mono.deferContextual(contextView -> {
            String traceId = contextView.getOrDefault("traceId", "");
            return WebClient.create()
                    .post()
                    .uri("http://external-service/api")
                    .header("X-Trace-Id", traceId) // forward the traceId
                    .retrieve()
                    .bodyToMono(String.class);
        });
    }
}
4.2 A complete microservice tracing architecture
// 1. Gateway layer: generate or forward the TraceId
//    (generateTraceId() and generateSpanId() are helpers omitted here)
@Component
@Slf4j
public class GatewayTraceFilter implements GlobalFilter, Ordered {
    @Override
    public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
        ServerHttpRequest request = exchange.getRequest();
        // Reuse the incoming traceId or generate one; assigned once so the
        // lambdas below can capture it
        String incoming = request.getHeaders().getFirst("X-Trace-Id");
        String traceId = (incoming != null && !incoming.trim().isEmpty())
                ? incoming.trim() : generateTraceId();
        // Additional tracing fields
        String spanId = generateSpanId();
        String parentSpanId = request.getHeaders().getFirst("X-Parent-Span-Id");
        // Build a new request that carries the tracing headers
        ServerHttpRequest.Builder builder = request.mutate()
                .header("X-Trace-Id", traceId)
                .header("X-Span-Id", spanId)
                .header("X-Service-Name", "gateway");
        if (parentSpanId != null) {
            // Only forward the parent span when there is one, instead of an empty header
            builder.header("X-Parent-Span-Id", parentSpanId);
        }
        ServerHttpRequest newRequest = builder.build();
        // Entry log
        log.info("Gateway received request: {} {}, traceId: {}",
                request.getMethod(), request.getURI(), traceId);
        // doOnSuccessOrError was deprecated in Reactor 3.3 and removed in 3.5;
        // doOnSuccess and doOnError are the replacements
        return chain.filter(exchange.mutate().request(newRequest).build())
                .doOnSuccess(v -> log.info("Gateway finished request: {} {}, traceId: {}",
                        request.getMethod(), request.getURI(), traceId))
                .doOnError(e -> log.error("Gateway request failed: {}, traceId: {}",
                        request.getURI(), traceId, e));
    }

    @Override
    public int getOrder() {
        return Ordered.HIGHEST_PRECEDENCE;
    }
}
// 2. Shared tracing aspect for services
@Aspect
@Component
@Slf4j
public class ServiceTracingAspect {
    @Around("@within(org.springframework.web.bind.annotation.RestController) || " +
            "@within(org.springframework.stereotype.Service)")
    public Object traceServiceMethod(ProceedingJoinPoint joinPoint) throws Throwable {
        String className = joinPoint.getSignature().getDeclaringType().getSimpleName();
        String methodName = joinPoint.getSignature().getName();
        // The log pattern already prints %X{traceId}; repeating it in the message is
        // optional and mainly helps when logs are read without the pattern prefix
        String traceId = MDC.get("traceId");
        String spanId = MDC.get("spanId");
        long startTime = System.currentTimeMillis();
        log.info("Start: {}.{}, traceId: {}, spanId: {}",
                className, methodName, traceId, spanId);
        try {
            Object result = joinPoint.proceed();
            long duration = System.currentTimeMillis() - startTime;
            log.info("Succeeded: {}.{}, took: {}ms, traceId: {}",
                    className, methodName, duration, traceId);
            return result;
        } catch (Throwable t) {
            // proceed() may throw any Throwable, so catch that rather than Exception
            long duration = System.currentTimeMillis() - startTime;
            log.error("Failed: {}.{}, took: {}ms, traceId: {}",
                    className, methodName, duration, traceId, t);
            throw t;
        }
    }
}
// 3. Database access tracing
@Aspect
@Component
@Slf4j
public class DatabaseTracingAspect {
    @Around("execution(* org.springframework.data.repository.Repository+.*(..))")
    public Object traceRepositoryMethod(ProceedingJoinPoint joinPoint) throws Throwable {
        String methodName = joinPoint.getSignature().getName();
        String traceId = MDC.get("traceId");
        // Log the start of the repository call (arguments may contain sensitive
        // data, so this stays at DEBUG)
        if (log.isDebugEnabled()) {
            Object[] args = joinPoint.getArgs();
            log.debug("Query started: {}, args: {}, traceId: {}",
                    methodName, Arrays.toString(args), traceId);
        }
        long startTime = System.nanoTime();
        try {
            Object result = joinPoint.proceed();
            long duration = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startTime);
            // Slow query detection
            if (duration > 100) { // longer than 100 ms
                log.warn("Slow query: {}, took: {}ms, traceId: {}",
                        methodName, duration, traceId);
            }
            if (log.isDebugEnabled()) {
                log.debug("Query finished: {}, took: {}ms, traceId: {}",
                        methodName, duration, traceId);
            }
            return result;
        } catch (Exception e) {
            long duration = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startTime);
            log.error("Query failed: {}, took: {}ms, traceId: {}",
                    methodName, duration, traceId, e);
            throw e;
        }
    }
}
Chapter 5: Case Study, End-to-End Tracing in an E-Commerce System
5.1 Complete configuration example
Complete logback-spring.xml:
<?xml version="1.0" encoding="UTF-8"?>
<configuration scan="true" scanPeriod="30 seconds">
    <!-- Variables -->
    <property name="LOG_PATH" value="./logs" />
    <property name="APP_NAME" value="ecommerce-service" />
    <!-- Log format with the full set of tracing fields -->
    <property name="LOG_PATTERN"
              value="%d{yyyy-MM-dd HH:mm:ss.SSS} [%X{traceId}] [%X{spanId}] [%X{userId}] [%thread] %-5level %logger{40} - %msg%n"/>
    <!-- Console output -->
    <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>${LOG_PATTERN}</pattern>
        </encoder>
    </appender>
    <!-- File output, rolled daily -->
    <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
        <file>${LOG_PATH}/${APP_NAME}.log</file>
        <encoder>
            <pattern>${LOG_PATTERN}</pattern>
        </encoder>
        <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
            <fileNamePattern>${LOG_PATH}/${APP_NAME}.%d{yyyy-MM-dd}.log</fileNamePattern>
            <maxHistory>30</maxHistory>
            <totalSizeCap>3GB</totalSizeCap>
        </rollingPolicy>
    </appender>
    <!-- One file per traceId (debugging only: this creates a file for every traced request).
         A plain file appender cannot resolve an MDC value in its file name; the
         SiftingAppender does, using the MDC key named in the discriminator. -->
    <appender name="TRACE_FILE" class="ch.qos.logback.classic.sift.SiftingAppender">
        <discriminator>
            <key>traceId</key>
            <defaultValue>unknown</defaultValue>
        </discriminator>
        <!-- Only events logged with the TRACE_LOG marker are accepted -->
        <filter class="ch.qos.logback.core.filter.EvaluatorFilter">
            <evaluator class="ch.qos.logback.classic.boolex.OnMarkerEvaluator">
                <marker>TRACE_LOG</marker>
            </evaluator>
            <onMatch>ACCEPT</onMatch>
            <onMismatch>DENY</onMismatch>
        </filter>
        <sift>
            <appender name="TRACE-${traceId}" class="ch.qos.logback.core.FileAppender">
                <file>${LOG_PATH}/traces/trace-${traceId}.log</file>
                <encoder>
                    <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level - %msg%n</pattern>
                </encoder>
            </appender>
        </sift>
    </appender>
    <!-- Asynchronous appender to keep file I/O off the request threads -->
    <appender name="ASYNC_FILE" class="ch.qos.logback.classic.AsyncAppender">
        <discardingThreshold>0</discardingThreshold>
        <queueSize>512</queueSize>
        <appender-ref ref="FILE" />
    </appender>
    <!-- Log levels. TRACE_FILE is attached here so marker-tagged events from
         application classes (assumed to live under com.ecommerce) reach it. -->
    <logger name="com.ecommerce" level="DEBUG" additivity="false">
        <appender-ref ref="CONSOLE" />
        <appender-ref ref="ASYNC_FILE" />
        <appender-ref ref="TRACE_FILE" />
    </logger>
    <!-- Dedicated trace logger, for code that logs through a logger named TRACE_LOGGER -->
    <logger name="TRACE_LOGGER" level="DEBUG" additivity="false">
        <appender-ref ref="TRACE_FILE" />
    </logger>
    <root level="INFO">
        <appender-ref ref="CONSOLE" />
        <appender-ref ref="ASYNC_FILE" />
    </root>
</configuration>
5.2 Tracing utility class
@Component
@Slf4j
public class TraceUtils {
    /**
     * Record a key business event.
     */
    public static void trace(String event, Object... args) {
        String traceId = MDC.get("traceId");
        String userId = MDC.get("userId");
        if (traceId != null) {
            // Tag the event with a Marker so the TRACE_FILE appender can pick it up
            org.slf4j.Marker marker = MarkerFactory.getMarker("TRACE_LOG");
            log.info(marker, "Business trace - {} | userId: {} | traceId: {} | args: {}",
                    event, userId, traceId, Arrays.toString(args));
        } else {
            log.info("Business trace - {} | args: {}", event, Arrays.toString(args));
        }
    }

    /**
     * Start a business phase.
     */
    public static TraceSpan startPhase(String phaseName) {
        // Remember the enclosing span and phase so endPhase can restore both
        String parentSpanId = MDC.get("spanId");
        String parentPhase = MDC.get("phase");
        String spanId = generateSpanId();
        MDC.put("spanId", spanId);
        MDC.put("phase", phaseName);
        long startTime = System.currentTimeMillis();
        trace("Phase started", "phase", phaseName, "spanId", spanId,
                "parentSpanId", parentSpanId);
        return new TraceSpan(phaseName, spanId, parentSpanId, parentPhase, startTime);
    }

    /**
     * End a business phase.
     */
    public static void endPhase(TraceSpan span) {
        long duration = System.currentTimeMillis() - span.getStartTime();
        // Log first, while the MDC still carries this phase's spanId
        trace("Phase ended", "phase", span.getPhaseName(),
                "spanId", span.getSpanId(), "duration", duration + "ms");
        // Then restore the enclosing span and phase, so nested phases unwind correctly
        restore("spanId", span.getParentSpanId());
        restore("phase", span.getParentPhase());
    }

    private static void restore(String key, String value) {
        if (value != null) {
            MDC.put(key, value);
        } else {
            MDC.remove(key);
        }
    }

    // Simple random span ID; any unique-enough scheme works here
    private static String generateSpanId() {
        return Long.toHexString(ThreadLocalRandom.current().nextLong());
    }

    /**
     * Print the current position in the call chain to the log.
     */
    public static void visualizeCallChain() {
        String traceId = MDC.get("traceId");
        String spanId = MDC.get("spanId");
        String phase = MDC.get("phase");
        if (traceId != null) {
            StringBuilder chain = new StringBuilder();
            chain.append("\n=== Call chain ===\n");
            chain.append("Trace ID: ").append(traceId).append("\n");
            chain.append("Current span: ").append(spanId).append("\n");
            chain.append("Current phase: ").append(phase).append("\n");
            chain.append("Thread: ").append(Thread.currentThread().getName()).append("\n");
            chain.append("Time: ").append(LocalDateTime.now()).append("\n");
            chain.append("==================\n");
            log.debug(chain.toString());
        }
    }

    public static class TraceSpan {
        private final String phaseName;
        private final String spanId;
        private final String parentSpanId;
        private final String parentPhase;
        private final long startTime;
        // Constructor and getters omitted
    }
}
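Nested phases are easiest to get right when entering and leaving are tied to a scope, so the enclosing phase comes back automatically. The following is a minimal JDK-only sketch of that idea; `PhaseStack`, `Scope`, `enter`, and `current` are hypothetical names, and a real version would mirror `current()` into the MDC under the "phase" key rather than keep its own thread-local.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative per-thread stack of phase names with scope-based unwinding. */
final class PhaseStack {
    private static final ThreadLocal<Deque<String>> STACK =
            ThreadLocal.withInitial(ArrayDeque::new);

    private PhaseStack() {}

    /** AutoCloseable without a checked exception, so try-with-resources stays tidy. */
    interface Scope extends AutoCloseable {
        @Override
        void close();
    }

    /** Pushes a phase; closing the returned scope pops it again. */
    static Scope enter(String phase) {
        STACK.get().push(phase);
        return () -> {
            Deque<String> stack = STACK.get();
            stack.pop();
            if (stack.isEmpty()) {
                STACK.remove(); // do not leave an empty deque on a pooled thread
            }
        };
    }

    /** Returns the innermost active phase, or null when none is active. */
    static String current() {
        return STACK.get().peek();
    }
}
```

Used with try-with-resources, leaving an inner phase restores the outer one even when the inner block throws:

```java
try (PhaseStack.Scope order = PhaseStack.enter("createOrder")) {
    try (PhaseStack.Scope stock = PhaseStack.enter("checkInventory")) {
        // PhaseStack.current() is "checkInventory" here
    }
    // PhaseStack.current() is "createOrder" again here
}
```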
5.3 Using it in business code
@Service
@Slf4j
public class OrderServiceImpl implements OrderService {
    @Autowired
    private PaymentService paymentService;
    @Autowired
    private InventoryService inventoryService;
    @Autowired
    private NotificationService notificationService;

    @Override
    @Transactional
    public OrderResult createOrder(CreateOrderRequest request) {
        // Start the order creation phase
        TraceUtils.TraceSpan orderSpan = TraceUtils.startPhase("createOrder");
        try {
            // Record the key inputs
            TraceUtils.trace("Create-order request received",
                    "userId", request.getUserId(),
                    "productId", request.getProductId(),
                    "quantity", request.getQuantity());
            // 1. Check inventory
            TraceUtils.TraceSpan inventorySpan = TraceUtils.startPhase("checkInventory");
            try {
                inventoryService.checkInventory(request.getProductId(), request.getQuantity());
                TraceUtils.trace("Inventory check passed");
            } finally {
                TraceUtils.endPhase(inventorySpan);
            }
            // 2. Create the order record
            Order order = saveOrder(request);
            TraceUtils.trace("Order record created", "orderId", order.getId());
            // 3. Call payment
            TraceUtils.TraceSpan paymentSpan = TraceUtils.startPhase("processPayment");
            try {
                PaymentResult paymentResult = paymentService.processPayment(
                        order.getId(), request.getPaymentMethod(), order.getTotalAmount());
                TraceUtils.trace("Payment succeeded",
                        "paymentId", paymentResult.getPaymentId(),
                        "amount", order.getTotalAmount());
            } finally {
                TraceUtils.endPhase(paymentSpan);
            }
            // 4. Reduce inventory
            inventoryService.reduceInventory(request.getProductId(), request.getQuantity());
            TraceUtils.trace("Inventory reduced");
            // 5. Send the notification (asynchronous)
            TraceUtils.TraceSpan notificationSpan = TraceUtils.startPhase("sendNotification");
            try {
                notificationService.sendOrderCreatedNotification(order.getId(), request.getUserId());
                TraceUtils.trace("Notification task submitted");
            } finally {
                TraceUtils.endPhase(notificationSpan);
            }
            // Print the current call chain (for debugging)
            TraceUtils.visualizeCallChain();
            return new OrderResult(true, "Order created", order.getId());
        } catch (Exception e) {
            TraceUtils.trace("Order creation failed", "error", e.getMessage());
            log.error("Exception while creating order", e);
            throw new BusinessException("Order creation failed", e);
        } finally {
            TraceUtils.endPhase(orderSpan);
        }
    }
}
Chapter 6: Performance and Best Practices
6.1 Measuring the MDC overhead
// A rough comparison, not a rigorous benchmark: there is no JIT warm-up, and the
// result is dominated by the appender. Use JMH when precise numbers are needed.
@SpringBootTest
@Slf4j
public class MDCPerformanceTest {
    @Test
    public void testMDCOverhead() {
        int iterations = 1000000;
        // Logging without MDC
        long startWithoutMDC = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            log.info("Test log without MDC, iteration: {}", i);
        }
        long timeWithoutMDC = System.nanoTime() - startWithoutMDC;
        // Logging with MDC
        long startWithMDC = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            MDC.put("traceId", "test-trace-" + i);
            MDC.put("userId", "user-" + i);
            log.info("Test log with MDC, iteration: {}", i);
            MDC.clear();
        }
        long timeWithMDC = System.nanoTime() - startWithMDC;
        double overhead = (double) (timeWithMDC - timeWithoutMDC) / timeWithoutMDC * 100;
        System.out.println("========== MDC performance test ==========");
        System.out.println("Without MDC: " + timeWithoutMDC / 1000000 + "ms");
        System.out.println("With MDC: " + timeWithMDC / 1000000 + "ms");
        System.out.println("Overhead: " + String.format("%.2f", overhead) + "%");
        System.out.println("Extra cost per operation: " +
                (timeWithMDC - timeWithoutMDC) / iterations + "ns");
    }

    @Test
    public void testStaleContextOnPooledThreads() throws Exception {
        // Simulate a thread pool whose tasks forget to clean up
        ExecutorService executor = Executors.newFixedThreadPool(10);
        List<Future<?>> futures = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            futures.add(executor.submit(() -> {
                MDC.put("traceId", UUID.randomUUID().toString());
                // Simulate business work
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                // Cleanup forgotten on purpose:
                // MDC.clear();
            }));
        }
        // Wait for all tasks to finish
        for (Future<?> future : futures) {
            future.get();
        }
        // Probe: a later task on the same pool still sees a traceId left behind by an
        // earlier task. Heap usage is not a reliable signal here; the visible symptom
        // of missing cleanup is log lines tagged with another request's ID.
        String stale = executor.submit(() -> MDC.get("traceId")).get();
        System.out.println("Stale traceId seen by a new task: " + stale);
        executor.shutdown();
    }
}
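6.2 Best practices
✅ Always do:
- Always clean up the MDC
    try {
        MDC.put("traceId", traceId);
        // business logic
    } finally {
        MDC.clear(); // required
    }
- Use the try-with-resources pattern
    public class MDCContext implements AutoCloseable {
        private final Map<String, String> previous;

        public MDCContext(String key, String value) {
            this.previous = MDC.getCopyOfContextMap();
            MDC.put(key, value);
        }

        @Override
        public void close() {
            if (previous != null) {
                MDC.setContextMap(previous);
            } else {
                MDC.clear();
            }
        }
    }

    // Usage
    try (MDCContext ctx = new MDCContext("traceId", "123")) {
        // business logic
    } // restored automatically
- Hand the MDC to asynchronous tasks explicitly
    // Wrong: reading the MDC inside the async method. It already runs on the pool
    // thread, where the MDC is empty, so there is nothing to restore.
    // Right: capture on the calling thread (or install a TaskDecorator) and
    // restore inside the task.
    Map<String, String> context = MDC.getCopyOfContextMap(); // on the calling thread
    executor.execute(() -> {
        if (context != null) {
            MDC.setContextMap(context);
        }
        try {
            // business logic
        } finally {
            MDC.clear();
        }
    });
❌ Always avoid:
- Do not store large objects in the MDC
    // Wrong
    MDC.put("largeObject", largeObject.toString()); // may be very large
    // Right: store only an ID or a digest
    MDC.put("objectId", largeObject.getId());
- Do not share one mutable context map between threads
    // Wrong: the task reads a map that the submitting thread keeps changing
    Map<String, String> shared = new HashMap<>();
    shared.put("traceId", traceId);
    executor.submit(() -> MDC.setContextMap(shared)); // may observe later changes
    shared.put("traceId", nextTraceId);
    // Right: take a snapshot per task; getCopyOfContextMap() returns a copy
    Map<String, String> snapshot = MDC.getCopyOfContextMap();
    executor.submit(() -> {
        if (snapshot != null) {
            MDC.setContextMap(snapshot);
        }
        try {
            // business logic
        } finally {
            MDC.clear();
        }
    });
- Avoid modifying the MDC in tight loops
    // Inefficient
    for (int i = 0; i < 1000; i++) {
        MDC.put("iteration", String.valueOf(i));
        log.info("Processing");
        MDC.remove("iteration");
    }
    // Better: set the MDC once and pass per-iteration data as log arguments
    MDC.put("batch", "true");
    for (int i = 0; i < 1000; i++) {
        log.info("Processing {}", i);
    }
    MDC.remove("batch");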
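6.3 Suggested production configuration
# application-prod.yml
logging:
  level:
    com.yourcompany: INFO
  pattern:
    # Production: compact format with the essential tracing fields
    console: "%d{yyyy-MM-dd HH:mm:ss} [%X{traceId}] %-5level %logger{20} - %msg%n"
    file: "%d{yyyy-MM-dd HH:mm:ss} [%X{traceId}] [%X{userId}] [%thread] %-5level %logger{30} - %msg%n"
  file:
    name: /var/log/app/app.log
  logback:
    rollingpolicy:
      # Spring Boot 2.4+; older versions use logging.file.max-size and logging.file.max-history
      max-file-size: 100MB
      max-history: 30
# Tracing settings. These are custom application properties, not built into
# Spring Boot: the application has to bind and honor them itself.
tracing:
  enabled: true
  # Log details only for slow requests
  slow-request-threshold: 1000ms
  # Sampling rate: can be lowered in production
  sampling-rate: 0.1 # detailed tracing for 10% of requests
  # Endpoints to skip, such as health checks
  exclude-paths:
    - /health
    - /metrics
    - /actuator/**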
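A sampling rate like the one above only helps if every service reaches the same decision for the same request; otherwise a trace ends up with detailed logs in one service and none in the next. One way to achieve that is to derive the decision from the traceId itself. The sketch below is JDK-only and illustrative: `TraceSampler` and `isSampled` are hypothetical names, and `String.hashCode()` is used for brevity even though it is a weak hash, so the actual sampled fraction may deviate from the configured rate.

```java
/** Illustrative deterministic sampler keyed on the trace ID. */
final class TraceSampler {
    private static final int BUCKETS = 10_000;

    private TraceSampler() {}

    /** Returns the same answer for the same traceId and rate on every service. */
    static boolean isSampled(String traceId, double samplingRate) {
        if (samplingRate <= 0.0) {
            return false;
        }
        if (samplingRate >= 1.0) {
            return true;
        }
        // floorMod keeps the bucket non-negative even for negative hash codes
        int bucket = Math.floorMod(traceId.hashCode(), BUCKETS);
        return bucket < (int) Math.round(samplingRate * BUCKETS);
    }
}
```

A filter could call `isSampled(traceId, 0.1)` once per request and store the result in the MDC, so downstream code only has to read a flag.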
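Chapter 7: Common Problems and Solutions
7.1 What if MDC stops working in WebFlux?
WebFlux is reactive and does not guarantee that a request stays on one thread, so a thread-bound MDC set in one operator may be missing, or belong to a different request, in the next.
Solution: use the Reactor Context
@Component
@Slf4j
public class ReactiveTraceFilter implements WebFilter {
    @Override
    public Mono<Void> filter(ServerWebExchange exchange, WebFilterChain chain) {
        // generateTraceId() is the helper from section 2.1, omitted here
        String incoming = exchange.getRequest().getHeaders().getFirst("X-Trace-Id");
        String traceId = (incoming != null) ? incoming : generateTraceId();
        return chain.filter(exchange)
                .doOnEach(signal -> {
                    // chain.filter returns Mono<Void>, which never emits onNext,
                    // so react to completion or error instead
                    if (signal.isOnComplete() || signal.isOnError()) {
                        // Read the traceId from the Context when logging
                        String tid = signal.getContextView().getOrDefault("traceId", "");
                        log.info("Request processed, traceId: {}", tid);
                    }
                })
                // contextWrite is only visible to operators above it, so it goes last
                .contextWrite(Context.of("traceId", traceId));
    }
}

// Reading it in business code
@Service
@Slf4j
public class OrderService {
    public Mono<Order> createOrder(OrderRequest request) {
        return Mono.deferContextual(contextView -> {
            String traceId = contextView.getOrDefault("traceId", "");
            return Mono.just(request)
                    .doOnNext(req -> log.info("Creating order, traceId: {}", traceId))
                    .flatMap(this::processOrder);
        });
    }
}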
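7.2 How do I integrate with an existing monitoring system?
@Component
public class MetricsIntegration {
    private final MeterRegistry meterRegistry;

    public MetricsIntegration(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
    }

    // Micrometer integration
    public void recordTraceMetrics(long duration, boolean success) {
        // Do not use the traceId as a tag: every request would create a new time
        // series. Keep tags low-cardinality and leave the traceId to the logs.
        Tags tags = Tags.of("success", String.valueOf(success));
        Timer timer = Timer.builder("request.duration")
                .tags(tags)
                .register(meterRegistry);
        timer.record(duration, TimeUnit.MILLISECONDS);
    }

    // Exposed to Prometheus through the Micrometer registry
    // (getActiveTraceCount() is an application-specific helper, omitted here)
    public void exposeTraceInfo() {
        Gauge.builder("tracing.active_requests",
                        () -> getActiveTraceCount())
                .description("Number of currently active traced requests")
                .register(meterRegistry);
    }

    // Jaeger/Zipkin integration (if already in use)
    public void integrateWithDistributedTracing() {
        // The traceId in the MDC can be linked to the distributed tracing system's traceId
        String mdcTraceId = MDC.get("traceId");
        // Attach it to the distributed tracing context
        // Tracing.currentSpan().setTag("mdc.traceId", mdcTraceId);
    }
}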
7.3 Log masking and security
@Component
public class SecureTraceFilter implements Filter {
    @Override
    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain) throws IOException, ServletException {
        HttpServletRequest httpRequest = (HttpServletRequest) request;
        try {
            // Tracing fields (generateTraceId() is the helper from section 2.1)
            MDC.put("traceId", generateTraceId());
            // getRequestId() requires Servlet 6.0 (Jakarta EE 10, Spring Boot 3)
            MDC.put("requestId", httpRequest.getRequestId());
            // getSession(false) avoids creating a session just for logging
            HttpSession session = httpRequest.getSession(false);
            if (session != null) {
                MDC.put("sessionId", maskSensitive(session.getId()));
            }
            MDC.put("clientIp", httpRequest.getRemoteAddr());
            // Never log the token itself; log a truncated hash instead
            String authHeader = httpRequest.getHeader("Authorization");
            if (authHeader != null && authHeader.startsWith("Bearer ")) {
                MDC.put("tokenHash", hashToken(authHeader.substring(7)));
            }
            chain.doFilter(request, response);
        } finally {
            MDC.clear();
        }
    }

    // Keep the first and last three characters, mask the rest
    private String maskSensitive(String value) {
        if (value == null || value.length() <= 8) {
            return "***";
        }
        return value.substring(0, 3) + "***" + value.substring(value.length() - 3);
    }

    // First 16 Base64 characters of the token's SHA-256 digest
    private String hashToken(String token) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(token.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(hash).substring(0, 16);
        } catch (Exception e) {
            return "hash-error";
        }
    }
}
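Summary
With MDC-based request tracing in place, we get the following:
🎯 Core value
- Faster troubleshooting: one traceId pulls up every related log entry
- Performance analysis: the time spent in each phase of a request is visible
- Better collaboration: operations, development, and testing share one tracing identifier
🛠️ Key techniques
- Setting and clearing the MDC: always clear it in a finally block
- Passing between threads: asynchronous tasks need the MDC context handed over explicitly, for example with a TaskDecorator
- Passing between services: forward the traceId in HTTP headers
- Masking: never put sensitive data into the MDC unmasked
📈 Adoption path
For most applications, the following stages work well:
Stage 1: basic MDC tracing (monolith) → Stage 2: async support → Stage 3: propagation across microservices → Stage 4: integration with an APM system
MDC is small, but it solves a large part of the log tracing problem. Starting with a single filter, your Spring Boot logs can go from interleaved noise to something you can follow request by request.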