End-to-End Tracing: OpenTelemetry and Jaeger in Practice
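Hi everyone, I'm Ouyang Rui (Rich Own). Today I want to talk about end-to-end tracing. As a full-stack developer working with microservices, I rely on it as a key tool for locating problems and tuning performance. In this post I'll share my hands-on experience with OpenTelemetry and Jaeger.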
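End-to-End Tracing Overview
Why Do We Need End-to-End Tracing?
| Problem | Description |
|---|---|
| Complex distributed systems | Long call chains are hard to follow |
| Hard-to-locate performance bottlenecks | No way to tell which service is slow |
| Difficult troubleshooting | Error propagation is hard to trace |
| User-experience monitoring | Requires end-to-end latency analysis |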
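Core Concepts
| Concept | Description |
|---|---|
| Trace | One complete request path through the system |
| Span | A single operation within a trace |
| Span Context | The identifying context (trace ID, span ID, trace flags) propagated across services |
| Baggage | User-defined key-value data propagated along with the context |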
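To make the relationship between these concepts concrete, here is a minimal sketch of the data model in plain JavaScript. It is only an illustration: `createSpan` is a hypothetical helper, not an OpenTelemetry API, and real spans carry much more (timestamps, attributes, status).

```javascript
const crypto = require('node:crypto');

// Hypothetical helper for illustration only; not part of OpenTelemetry
function createSpan(name, parent = null) {
  return {
    name,
    // Every span in a trace shares one trace ID (16 bytes, 32 hex characters)
    traceId: parent ? parent.traceId : crypto.randomBytes(16).toString('hex'),
    // Each span has its own span ID (8 bytes, 16 hex characters)
    spanId: crypto.randomBytes(8).toString('hex'),
    // The parent link turns a flat list of spans into a tree
    parentSpanId: parent ? parent.spanId : null,
  };
}

// A root span and one child operation inside the same trace
const root = createSpan('GET /checkout');
const child = createSpan('SELECT orders', root);
```

The `traceId` and `spanId` pair is essentially what a Span Context carries across service boundaries, which is how a backend such as Jaeger can stitch spans from different services into one trace.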
Getting Started with OpenTelemetry
Installing Dependencies
bash
npm install @opentelemetry/sdk-node @opentelemetry/api
npm install @opentelemetry/exporter-jaeger @opentelemetry/auto-instrumentations-node
Basic Configuration
javascript
// tracing.js
// Load this file before the rest of the application so that
// auto-instrumentation can patch modules as they are required.
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { JaegerExporter } = require('@opentelemetry/exporter-jaeger');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');

const sdk = new NodeSDK({
  // Send spans to the Jaeger collector's HTTP endpoint
  traceExporter: new JaegerExporter({
    endpoint: 'http://localhost:14268/api/traces'
  }),
  // Automatically instrument common libraries (HTTP, Express, database clients, and so on)
  instrumentations: [getNodeAutoInstrumentations()]
});

sdk.start();
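Creating Spans Manually
javascript
const { trace, SpanStatusCode } = require('@opentelemetry/api');

const tracer = trace.getTracer('my-service');

async function processRequest(req) {
  // startActiveSpan makes the new span the active one for the duration of the callback
  return tracer.startActiveSpan('process-request', async (span) => {
    span.setAttribute('request.id', req.id);
    try {
      // fetchUser and createOrder are application-specific helpers
      const user = await fetchUser(req.userId);
      const order = await createOrder(user);
      return order;
    } catch (err) {
      // Record the failure on the span before rethrowing
      span.recordException(err);
      span.setStatus({ code: SpanStatusCode.ERROR, message: err.message });
      throw err;
    } finally {
      // Always end the span, even when an error occurs
      span.end();
    }
  });
}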
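If you create spans by hand in many places, the try/catch/finally boilerplate gets repetitive. One option is a small wrapper like the sketch below. `withSpan` and `createFakeTracer` are hypothetical names of my own, not OpenTelemetry APIs; the fake tracer only exists so the sketch runs without the SDK installed. The constant `2` is the numeric value of `SpanStatusCode.ERROR` in `@opentelemetry/api`, inlined here to keep the example dependency-free.

```javascript
// Numeric value of SpanStatusCode.ERROR, inlined so the sketch has no dependencies
const STATUS_ERROR = 2;

// Hypothetical helper: runs fn inside an active span and always ends the span
async function withSpan(tracer, name, fn) {
  return tracer.startActiveSpan(name, async (span) => {
    try {
      return await fn(span);
    } catch (err) {
      // Mark the span as failed, then let the caller handle the error
      span.recordException(err);
      span.setStatus({ code: STATUS_ERROR, message: err.message });
      throw err;
    } finally {
      span.end();
    }
  });
}

// Minimal stand-in for a tracer that logs what happens to each span
function createFakeTracer(log) {
  return {
    startActiveSpan(name, fn) {
      const span = {
        setAttribute: (key, value) => log.push(`attr ${key}=${value}`),
        recordException: (err) => log.push(`exception ${err.message}`),
        setStatus: (status) => log.push(`status ${status.code}`),
        end: () => log.push(`end ${name}`),
      };
      return fn(span);
    },
  };
}
```

With a real tracer from `trace.getTracer(...)`, a call would look like `withSpan(tracer, 'process-request', (span) => doWork(span))`, where `doWork` stands for your own logic.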
Jaeger Configuration
Starting Jaeger
yaml
# docker-compose.yml
version: '3.7'
services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "5775:5775/udp"
      - "6831:6831/udp"
      - "6832:6832/udp"
      - "5778:5778"
      - "16686:16686"   # Jaeger UI
      - "14268:14268"   # Collector HTTP endpoint used by the exporter
      - "9411:9411"     # Zipkin-compatible endpoint
    environment:
      - COLLECTOR_ZIPKIN_HOST_PORT=:9411
Accessing the Jaeger UI
bash
# Start the services
docker-compose up -d
# Open the dashboard (on Linux, use xdg-open or paste the URL into a browser)
open http://localhost:16686
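Case Study: Distributed Tracing
javascript
// Service A
const { trace, context, propagation } = require('@opentelemetry/api');
const tracer = trace.getTracer('service-a');

async function handleRequest(req) {
  return tracer.startActiveSpan('handle-request', async (span) => {
    span.setAttribute('service', 'service-a');
    try {
      // Inject the active trace context as W3C headers (traceparent, tracestate).
      // A traceparent value has the form 00-<trace-id>-<span-id>-<flags>,
      // so the trace ID alone is not a valid header value.
      const headers = {};
      propagation.inject(context.active(), headers);
      // Call service B
      const response = await fetch('http://service-b/api/data', { headers });
      return await response.json();
    } finally {
      span.end();
    }
  });
}

// Service B (runs as a separate process)
const { trace, context, propagation } = require('@opentelemetry/api');
const tracer = trace.getTracer('service-b');

async function handleRequest(req) {
  // Restore the caller's context from the incoming headers so this span joins the same trace
  const parentContext = propagation.extract(context.active(), req.headers);
  return tracer.startActiveSpan('process-data', {}, parentContext, async (span) => {
    span.setAttribute('service', 'service-b');
    try {
      // Query the database
      return await db.query('SELECT * FROM users');
    } finally {
      span.end();
    }
  });
}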
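It helps to know what actually travels between the two services. The W3C Trace Context `traceparent` header packs four fields into one string: version, trace ID, parent span ID, and flags. The sketch below builds and parses that format by hand. `formatTraceparent` and `parseTraceparent` are hypothetical helpers for illustration and only handle version `00`; in real code the OpenTelemetry propagator does this work for you.

```javascript
// traceparent layout: <version>-<trace-id>-<parent-id>-<flags>, all lowercase hex
const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Hypothetical helper: builds a header value from span context fields
function formatTraceparent({ traceId, spanId, traceFlags }) {
  const flags = traceFlags.toString(16).padStart(2, '0');
  return `00-${traceId}-${spanId}-${flags}`;
}

// Hypothetical helper: returns the parsed fields, or null for a malformed header
function parseTraceparent(header) {
  const match = TRACEPARENT_RE.exec(header);
  if (!match) return null;
  const [, version, traceId, spanId, flags] = match;
  // Version ff is forbidden, and all-zero IDs are invalid
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  return { version, traceId, spanId, traceFlags: parseInt(flags, 16) };
}

const header = formatTraceparent({
  traceId: '4bf92f3577b34da6a3ce929d0e0e4736',
  spanId: '00f067aa0ba902b7',
  traceFlags: 1, // bit 0 set means "sampled"
});

// A bare trace ID does not match the format and is rejected
const rejected = parseTraceparent('4bf92f3577b34da6a3ce929d0e0e4736');
```

This is why sending only the trace ID is not enough: the receiving service also needs the caller's span ID to set the parent link, and the flags to honor the sampling decision.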
Best Practices
1. Custom Instrumentation
javascript
const { InstrumentationBase } = require('@opentelemetry/instrumentation');

class MyInstrumentation extends InstrumentationBase {
  constructor() {
    super('my-instrumentation', '1.0.0');
  }

  init() {
    // Custom instrumentation logic goes here
  }
}
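2. Sampling Strategy
javascript
const { ParentBasedSampler, TraceIdRatioBasedSampler } = require('@opentelemetry/sdk-trace-base');

// Follow the parent's sampling decision when there is one;
// for new root traces, sample 10% of them.
const sampler = new ParentBasedSampler({
  root: new TraceIdRatioBasedSampler(0.1) // 10% sampling ratio
});
// Pass it to the SDK through the `sampler` option, e.g. new NodeSDK({ sampler, ... })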
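Why sample by trace ID instead of rolling a random number per span? Because the decision becomes deterministic: every service that sees the same trace ID reaches the same verdict, so traces are kept or dropped as a whole. The sketch below shows the idea in plain JavaScript. It is a simplified illustration under my own assumptions, not the SDK's actual implementation, and `sampleByTraceId` and `shouldSample` are hypothetical names.

```javascript
// Simplified ratio sampling: map the trace ID to a number and compare it to a threshold
function sampleByTraceId(traceId, ratio) {
  // Interpret the first 8 hex characters as an unsigned 32-bit number
  const bucket = parseInt(traceId.slice(0, 8), 16);
  // Keep the trace when the number falls in the lowest `ratio` fraction of the range
  return bucket < ratio * 0x100000000;
}

// Parent-based wrapper: follow the caller's decision, use the ratio only for root spans
function shouldSample(traceId, ratio, parentSampled) {
  if (parentSampled !== undefined) return parentSampled;
  return sampleByTraceId(traceId, ratio);
}

// Same trace ID, same ratio: the decision never changes between calls or services
const traceId = '4bf92f3577b34da6a3ce929d0e0e4736';
const first = shouldSample(traceId, 0.5);
const second = shouldSample(traceId, 0.5);
```

The parent-based part matters just as much: once the root service has decided, downstream services follow the sampled flag from the incoming context rather than deciding again.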
3. Metrics Integration
javascript
const { MeterProvider } = require('@opentelemetry/sdk-metrics');

// Note: attach a metric reader/exporter to the provider to actually export these measurements
const meterProvider = new MeterProvider();
const meter = meterProvider.getMeter('my-service');

const requestCounter = meter.createCounter('requests', {
  description: 'Total requests'
});

// Use the counter
requestCounter.add(1, { status: 'success' });
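Summary
End-to-end tracing is an indispensable tool in a microservice architecture. With OpenTelemetry and Jaeger, you can trace requests end to end and analyze performance across services.
My bearded dragon, Hash, has its own take on tracing: it can always track down exactly where the crickets are. Maybe that is nature's version of "end-to-end tracing"!
If you have any questions about end-to-end tracing, feel free to leave a comment. I'm Ouyang Rui, and the geek's road never ends!
Tech stack: OpenTelemetry · Jaeger · End-to-End Tracing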