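Preface
💡 Pain points: Struggling to split services? Inter-service communication a tangled mess? Burned repeatedly by distributed transactions? Tracing requests by guessing from logs?
🎯 Solution: Work through architecture design → service decomposition → communication → distributed transactions → observability → DevOps to build a systematic grasp of microservice architecture.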
[Figure: microservice cluster overview. Access layer: API Gateway (Kong/Envoy). Service mesh: Istio/Linkerd. Business services: User Service, Order Service, Product Service. Infrastructure: registry (Eureka/Nacos), config center (Apollo/Consul), message queue (Kafka/RabbitMQ), cache (Redis Cluster). Observability: tracing (Jaeger/Zipkin), log aggregation (ELK Stack), metrics (Prometheus+Grafana).]
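Core microservice components at a glance:
| Component | Options | Purpose |
|---|---|---|
| Service registry | Nacos/Eureka/Consul | Service registration and discovery |
| Config center | Apollo/Nacos/Spring Cloud Config | Centralized configuration management |
| API gateway | Kong/Envoy/Zuul/Gateway | Single entry point + routing |
| Service mesh | Istio/Linkerd | Traffic management + security |
| Message queue | Kafka/RabbitMQ/RocketMQ | Asynchronous decoupling |
| Distributed tracing | Jaeger/Zipkin/SkyWalking | Request tracing across services |
| Log aggregation | ELK/Loki | Centralized log management |
| Metrics monitoring | Prometheus+Grafana | Metrics collection and dashboards |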
1. Microservice Architecture Design Principles
1.1 Decomposition Principles
[Figure: domain-driven decomposition. Decomposition drivers: Bounded Context, User Experience, Organizational structure, Technology differences. Service A (User Context), Service B (Order Context), and Service C (Payment Context) communicate through asynchronous events.]
python
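# ===== Service decomposition principles =====
"""
Decomposition dimensions:
1. Single Responsibility Principle (SRP)
   Each service owns exactly one business capability
   ✅ User Service (registration / login / authentication)
   ❌ User + Order + Payment bundled together
2. Domain-Driven Design (DDD)
   Split along business boundaries (Bounded Contexts)
   - User context: user profiles, authentication and authorization
   - Order context: order creation, status transitions
   - Payment context: payment, settlement, reconciliation
3. Independent deployment and scaling
   Each service can be deployed and scaled on its own
   High-traffic modules can be scaled out independently
4. Conway's Law
   System architecture tends to mirror team structure
   Small teams → small services → fast iteration
"""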
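# ===== Judging granularity =====
# Too coarse: monolith
#   Traits: single codebase, millions of lines of code, every deployment updates everything
#   Problems: slow deployments, inflexible scaling, locked into one tech stack
# Too fine: too many tiny services (the "nanoservice" anti-pattern)
#   Traits: 100+ services, each exposing only 2-3 endpoints
#   Problems: operational complexity explodes, distributed transactions spread, network overhead
# About right (rough heuristic): service count ≈ team size × 2~3
#   Rules of thumb:
#     5-person team  → 10~15 services
#     10-person team → 20~30 services
#     Each service is owned by 2~5 people who know it well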
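# ===== A typical decomposition example =====
"""
E-commerce platform:
User domain:
  - user-service: registration / login / authentication
  - auth-service: OAuth2 / JWT token service
  - account-service: accounts, points, membership levels
Product domain:
  - product-service: product catalog management
  - inventory-service: inventory management
  - search-service: product search (Elasticsearch)
Order domain:
  - order-service: order creation / modification
  - cart-service: shopping cart
  - price-service: price calculation
Payment domain:
  - payment-service: payment gateway integration
  - settlement-service: settlement, reconciliation
  - invoice-service: invoicing
Marketing domain:
  - promotion-service: promotions
  - coupon-service: coupons
  - seckill-service: flash sales
Infrastructure domain:
  - gateway-service: API gateway
  - notification-service: notifications
  - file-service: file storage
"""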
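# ===== Migration strategies =====
# Strategy 1: Strangler Fig
#   Replace the legacy system with new services step by step while it keeps running
#   1. Put an API Gateway in front of the legacy system
#   2. Build new features as new services
#   3. Migrate existing features to new services one at a time
#   4. Finally, decommission the legacy system
# Strategy 2: Domain first, with data synchronization
#   Split the data first so that each service is cohesive
#   1. Identify the core domains
#   2. Each new service reads and writes its own data
#   3. Sync legacy data via CDC (Debezium)
#   4. Switch read traffic over to the new service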
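The Strangler Fig steps above boil down to a routing decision at the gateway: paths that have been migrated go to a new service, everything else still goes to the legacy system. The following is a minimal sketch of that decision, not a gateway implementation. The prefix table, the backend URLs, and the function name `resolve_backend` are all hypothetical.

```python
# Hypothetical routing table: path prefixes that have already been migrated.
MIGRATED_PREFIXES = {
    "/api/users": "http://user-service:8080",
    "/api/orders": "http://order-service:8080",
}
# Assumed address of the legacy monolith that still serves everything else.
LEGACY_BACKEND = "http://legacy-monolith:8080"


def resolve_backend(path: str) -> str:
    """Return the backend for a request path.

    The longest migrated prefix wins; unmigrated paths fall back to the legacy system.
    A prefix matches only on a path-segment boundary, so /api/usersx is not /api/users.
    """
    matches = [p for p in MIGRATED_PREFIXES if path == p or path.startswith(p + "/")]
    if not matches:
        return LEGACY_BACKEND
    return MIGRATED_PREFIXES[max(matches, key=len)]
```

Migrating another feature then means adding one entry to the table, and rolling back means removing it, which is what keeps each migration step small and reversible.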
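1.2 Synchronous vs. Asynchronous Communication Between Services
python
# ===== Synchronous communication (REST / gRPC) =====
# REST API (general-purpose and simple to use)
import requests

class OrderService:
    def __init__(self, user_service_url, db):
        self.user_url = user_service_url
        self.db = db  # order repository (placeholder); must be supplied by the caller

    def create_order(self, user_id, product_id):
        # Synchronous call to the user service to verify that the user exists
        resp = requests.get(f'{self.user_url}/users/{user_id}', timeout=3)
        if resp.status_code != 200:
            raise ValueError('User not found')
        # Create the order
        order = self.db.create(...)
        return order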
# gRPC (high performance, binary protocol, well suited to internal calls)
# order_service.proto (the Order, OrderItem and GetOrderRequest messages are omitted here)
# syntax = "proto3";
# package order;
#
# service OrderService {
#   rpc CreateOrder(CreateOrderRequest) returns (OrderResponse);
#   rpc GetOrder(GetOrderRequest) returns (Order);
# }
#
# message CreateOrderRequest {
#   int64 user_id = 1;
#   repeated OrderItem items = 2;
# }
#
# message OrderResponse {
#   Order order = 1;
#   string message = 2;
# }
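# ===== Asynchronous communication (message queue) =====
import json
import uuid
from datetime import datetime

from kafka import KafkaProducer, KafkaConsumer

# db, inventory_service and notification_service below are placeholders for the service's own components

# Producer: publish domain events
producer = KafkaProducer(
    bootstrap_servers=['kafka:9092'],
    value_serializer=lambda v: json.dumps(v).encode('utf-8')
)

def create_order(user_id, items):
    order = db.create_order(user_id, items)
    # Publish the OrderCreated event; send() only enqueues the record and does not block
    producer.send('order.created', {
        'event_id': str(uuid.uuid4()),
        'event_type': 'OrderCreated',
        'order_id': order.id,
        'user_id': user_id,
        'items': items,
        'timestamp': datetime.now().isoformat()
    })
    # flush() blocks until buffered records are sent; drop it on hot paths if a short delay is acceptable
    producer.flush()
    return order  # returns without waiting for downstream services

# Consumer: listen for events and handle them
consumer = KafkaConsumer(
    'order.created',
    bootstrap_servers=['kafka:9092'],
    group_id='inventory-service',
    value_deserializer=lambda m: json.loads(m.decode('utf-8'))
)

for msg in consumer:
    event = msg.value
    if event['event_type'] == 'OrderCreated':
        # Deduct inventory (handled asynchronously)
        inventory_service.deduct(event['items'])
        # Order notification (handled asynchronously)
        notification_service.send_sms(event['user_id'], 'Order created')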
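A consumer like the one above can see the same event more than once, for example after a rebalance or a retry, so handlers that deduct stock should be idempotent. Here is a minimal sketch that deduplicates on the `event_id` field the producer already sets. The class name `IdempotentHandler` is hypothetical, and the in-memory set stands in for what would be a durable store (a database table or Redis) in a real service.

```python
class IdempotentHandler:
    """Wrap an event handler so that each event_id is processed at most once."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # assumption: in-memory only; use a durable store in production

    def handle(self, event: dict) -> bool:
        """Process the event; return False if it was a duplicate and was skipped."""
        event_id = event["event_id"]
        if event_id in self._seen:
            return False
        self._handler(event)
        # Record the id only after the handler succeeded, so a failed attempt can be retried
        self._seen.add(event_id)
        return True


# Usage: a toy stock table and a handler that deducts from it
stock = {"sku-1": 10}

def deduct(event):
    for item in event["items"]:
        stock[item["product_id"]] -= item["quantity"]

handler = IdempotentHandler(deduct)
event = {"event_id": "e-1", "items": [{"product_id": "sku-1", "quantity": 2}]}
first = handler.handle(event)
second = handler.handle(event)  # redelivery of the same event
```

In a real service the "seen" check and the stock update would share one database transaction, otherwise a crash between the two could still apply the deduction twice.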
2. Service Registration and Discovery
2.1 Nacos as the Registry
yaml
# ===== Nacos service registration =====
# Spring Boot service registration
# application.yml
spring:
  application:
    name: order-service
  cloud:
    nacos:
      discovery:
        enabled: true
        server-addr: nacos-server:8848
        namespace: ${NACOS_NAMESPACE:public}
        group: ${SERVICE_GROUP:DEFAULT_GROUP}
        weight: 1 # weight used for load balancing
        instance-id: ${spring.application.name}:${server.port}:${random.value}
        ephemeral: true # ephemeral instance (removed once it becomes unhealthy)
        # Service metadata
        metadata:
          version: v1
          region: beijing
          zone: zone-a
          microservices: true
# ===== Nacos config center =====
# Shared configuration (used by several services)
# Data ID: shared-datasource.yaml
spring:
  datasource:
    url: jdbc:mysql://mysql:3306/shop
    username: root
    password: ${DB_PASSWORD}
# Gray-release configuration (by version / label)
# Data ID: order-service.yaml
# Group: GRAY_GROUP
# Content:
# spring:
#   cloud:
#     nacos:
#       config:
#         group: ${NACOS_CONFIG_GROUP:GRAY_GROUP}
# ===== Nacos OpenAPI =====
import time

import requests

NACOS_SERVER = 'http://nacos-server:8848'

# Register a service instance
def register_instance(service_name, ip, port):
    requests.post(f'{NACOS_SERVER}/nacos/v1/ns/instance', params={
        'serviceName': service_name,
        'ip': ip,
        'port': port,
        'ephemeral': True,
        'weight': 1,
        'enabled': True,
        'healthy': True,
        'clusterName': 'DEFAULT'
    }, timeout=3)

# Heartbeat renewal (keeps the instance alive); runs forever, so start it in a background thread
def send_heartbeat(service_name, ip, port, beat_interval=5):
    while True:
        requests.put(f'{NACOS_SERVER}/nacos/v1/ns/instance/beat', params={
            'serviceName': service_name,
            'ip': ip,
            'port': port,
            'beatIntervalSeconds': beat_interval
        }, timeout=3)
        time.sleep(beat_interval)

# Service discovery
def discover_instances(service_name):
    resp = requests.get(f'{NACOS_SERVER}/nacos/v1/ns/instance/list', params={
        'serviceName': service_name,
        'healthyOnly': True
    }, timeout=3)
    return resp.json()['hosts']

# Watch for service changes
# Note: this sketch only fetches the instance list once and hands it to the callback;
# real push-style subscription is provided by the Nacos client SDKs, not by this HTTP call
def subscribe_service(service_name, callback):
    url = f'{NACOS_SERVER}/nacos/v1/ns/instance/list'
    params = {'serviceName': service_name}
    resp = requests.get(url, params=params, timeout=3)
    instances = resp.json()['hosts']
    callback(instances)
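Once `discover_instances` has returned a host list, the client still has to pick one instance. A minimal sketch of weight-aware selection follows, assuming each host is a dict with `ip`, `port`, `weight`, and `healthy` keys like the entries in the instance-list response. The function name `pick_instance` and the sample hosts are hypothetical.

```python
import random


def pick_instance(hosts, rng=random):
    """Pick one healthy instance at random, proportionally to its weight."""
    candidates = [
        h for h in hosts
        if h.get("healthy", True) and h.get("weight", 1) > 0
    ]
    if not candidates:
        raise LookupError("no healthy instance available")
    weights = [h.get("weight", 1) for h in candidates]
    # random.choices performs weighted sampling with replacement
    return rng.choices(candidates, weights=weights, k=1)[0]


# Usage with a hypothetical host list
hosts = [
    {"ip": "10.0.0.1", "port": 8080, "weight": 1, "healthy": True},
    {"ip": "10.0.0.2", "port": 8080, "weight": 0, "healthy": True},   # drained
    {"ip": "10.0.0.3", "port": 8080, "weight": 5, "healthy": False},  # failing health checks
]
chosen = pick_instance(hosts)
```

Setting an instance's weight to 0 is a simple way to drain it before a deployment without deregistering it.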
2.2 Load-Balancing Strategies
python
# ===== Server-side load balancing (Nginx/LB) =====
# Nginx load-balancing configuration
upstream order-service {
    server order-1:8080 weight=5;
    server order-2:8080 weight=3;
    server order-3:8080 weight=2;
}
server {
    location /api/orders {
        proxy_pass http://order-service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
# ===== Client-side load balancing (Ribbon / Spring Cloud LoadBalancer) =====
# Built-in Ribbon rules
# RoundRobinRule: round robin (Ribbon's own default)
# RandomRule: random
# WeightedResponseTimeRule: weighted by response time
# BestAvailableRule: picks the server with the fewest concurrent requests
# ZoneAvoidanceRule: availability-zone aware
# Configuring the Ribbon rule
# application.yml
order-service:
  ribbon:
    NFLoadBalancerRuleClassName: com.netflix.loadbalancer.WeightedResponseTimeRule
    ConnectTimeout: 3000
    ReadTimeout: 5000
    MaxAutoRetries: 2
    MaxAutoRetriesNextServer: 3
# ===== Spring Cloud LoadBalancer =====
# Spring Cloud LoadBalancer is the replacement now that Ribbon is in maintenance mode
@LoadBalancerClient(name = "order-service",
        configuration = CustomLoadBalancerConfig.class)
@RestController
public class OrderController {
    // The RestTemplate bean must be declared with @LoadBalanced for service-name URLs to resolve
    @Autowired
    private RestTemplate restTemplate;

    @GetMapping("/orders/{id}")
    public Order getOrder(@PathVariable Long id) {
        // Client-side load balancing: instances are fetched from the registry and one is chosen per call
        return restTemplate.getForObject(
                "http://order-service/orders/" + id, Order.class
        );
    }
}
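# ===== Custom load-balancing algorithm =====
// Illustrative sketch: the signature is simplified. The real ServiceInstanceListSupplier
// exposes get() returning Flux<List<ServiceInstance>>, and populating `circle` is omitted.
public class ConsistentHashLoadBalancer implements ServiceInstanceListSupplier {
    private final TreeMap<Long, ServiceInstance> circle = new TreeMap<>();

    @Override
    public List<ServiceInstance> getInstances(TransportRequest request) {
        // Consistent hashing: the same key maps to the same node as long as the node set is unchanged
        String key = request.getUrl().getPath();
        long hash = hash(key);
        Map.Entry<Long, ServiceInstance> entry = circle.ceilingEntry(hash);
        if (entry == null) {
            // Wrap around to the first node on the ring
            entry = circle.firstEntry();
        }
        return Collections.singletonList(entry.getValue());
    }

    private long hash(String key) {
        // MurmurHash here stands for any well-distributed 64-bit hash helper
        return MurmurHash.hash64(key);
    }
}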
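The ring lookup in the Java snippet above (`ceilingEntry`, then wrap to `firstEntry`) can be tried out end to end with a short, self-contained sketch. This version adds virtual nodes so that keys spread more evenly. The class name `ConsistentHashRing`, the choice of MD5 as the hash, and the default of 100 virtual nodes are assumptions for illustration, not part of the original design.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Consistent hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key: str) -> int:
        # First 8 bytes of MD5 as an unsigned 64-bit integer (assumed hash choice)
        return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")

    def add(self, node: str) -> None:
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring position with hash >= hash(key); wrap to the start if none
        idx = bisect.bisect_left(self._ring, (self._hash(key), ""))
        if idx == len(self._ring):
            idx = 0
        return self._ring[idx][1]


# Usage: removing a node only remaps the keys that lived on that node
ring = ConsistentHashRing(["order-1", "order-2", "order-3"])
keys = [f"/orders/{i}" for i in range(200)]
before = {k: ring.get(k) for k in keys}
ring.remove("order-3")
after = {k: ring.get(k) for k in keys}
```

The property that matters for caches and sticky sessions is visible in the usage: keys that were on `order-1` or `order-2` stay where they were after `order-3` leaves.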
3. API Gateway and Routing
3.1 Spring Cloud Gateway
yaml
# ===== Gateway configuration =====
# application.yml
spring:
  cloud:
    gateway:
      discovery:
        locator:
          enabled: true # enable routes derived from service discovery
          lower-case-service-id: true # use lower-case service IDs
      # Route definitions
      routes:
        - id: user-service
          uri: lb://user-service # lb = resolve instances from the registry
          predicates:
            - Path=/api/users/**
            - Method=GET,POST
          filters:
            - StripPrefix=1 # strip the /api prefix
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 100
                redis-rate-limiter.burstCapacity: 200
        - id: order-service
          uri: lb://order-service
          predicates:
            - Path=/api/orders/**
          filters:
            - StripPrefix=1
            - name: CircuitBreaker
              args:
                name: orderCircuitBreaker
                fallbackUri: forward:/fallback/order
        - id: auth-service
          uri: lb://auth-service
          predicates:
            - Path=/api/auth/**
          filters:
            - StripPrefix=1
# ===== Global filter =====
@Component
public class AuthGlobalFilter implements GlobalFilter, Ordered {
    @Autowired
    private JwtUtil jwtUtil;

    @Override
    public Mono<Void> filter(ServerWebExchange exchange,
                             GatewayFilterChain chain) {
        String path = exchange.getRequest().getURI().getPath();
        // Skip whitelisted paths
        if (path.startsWith("/api/auth/") || path.startsWith("/api/public/")) {
            return chain.filter(exchange);
        }
        String token = exchange.getRequest().getHeaders()
                .getFirst("Authorization");
        if (token == null || !token.startsWith("Bearer ")) {
            exchange.getResponse().setStatusCode(HttpStatus.UNAUTHORIZED);
            return exchange.getResponse().setComplete();
        }
        try {
            Claims claims = jwtUtil.parseToken(token.substring(7));
            String userId = claims.getSubject();
            // Pass user information on to downstream services
            ServerHttpRequest mutatedRequest = exchange.getRequest().mutate()
                    .header("X-User-Id", userId)
                    .header("X-User-Role", claims.get("role", String.class))
                    .build();
            return chain.filter(
                    exchange.mutate().request(mutatedRequest).build()
            );
        } catch (Exception e) {
            exchange.getResponse().setStatusCode(HttpStatus.UNAUTHORIZED);
            return exchange.getResponse().setComplete();
        }
    }

    @Override
    public int getOrder() {
        return -100; // lower values run earlier; -100 puts this filter ahead of most others
    }
}
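The `RequestRateLimiter` settings in the route configuration above (`replenishRate: 100`, `burstCapacity: 200`) describe a token bucket: tokens refill at a steady rate and the bucket size caps the burst. Below is a minimal in-process sketch of that algorithm, useful for reasoning about what the two numbers mean. It is not the gateway's Redis-backed implementation; the class name `TokenBucket` and the explicit `now` argument (which keeps the example deterministic) are choices made for this illustration.

```python
class TokenBucket:
    """Token bucket: refill at `replenish_rate` tokens/second, hold at most `burst_capacity`."""

    def __init__(self, replenish_rate: float, burst_capacity: float, now: float = 0.0):
        self.rate = replenish_rate
        self.capacity = burst_capacity
        self.tokens = float(burst_capacity)  # start full
        self.last = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Return True and consume `cost` tokens if the request may pass at time `now`."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Usage: 100 requests/second sustained, bursts of up to 200
bucket = TokenBucket(replenish_rate=100, burst_capacity=200)
burst_allowed = sum(bucket.allow(now=0.0) for _ in range(250))   # 250 requests at t=0
later_allowed = sum(bucket.allow(now=0.5) for _ in range(100))   # 100 requests at t=0.5s
```

At t=0 only the burst capacity gets through; half a second later the bucket has refilled by rate × elapsed time, so exactly that many further requests pass.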
3.2 Kong API Gateway
yaml
# ===== Kong declarative configuration =====
# kong.yml
_format_version: "3.0"
services:
  - name: user-service
    url: http://user-service:8080
    routes:
      - name: user-route
        paths:
          - /api/users
        methods:
          - GET
          - POST
        strip_path: false
    plugins:
      - name: rate-limiting
        config:
          minute: 100
          policy: redis
          redis_host: redis
      - name: cors
        config:
          origins:
            - "https://example.com"
          methods:
            - GET
            - POST
            - PUT
            - DELETE
          headers:
            - Authorization
            - Content-Type
          credentials: true
      - name: jwt
        config:
          key_claim_name: kid
          claims_to_verify:
            - exp
  - name: order-service
    url: http://order-service:8080
    routes:
      - name: order-route
        paths:
          - /api/orders
    plugins:
      - name: rate-limiting
        config:
          minute: 50
          policy: local
      # Note: circuit-breaker is not one of Kong's bundled plugins; this block assumes a
      # custom or third-party plugin, and its field names are illustrative only
      - name: circuit-breaker
        config:
          status_code: 503
          failure_reason: "Service unavailable"
          bubble_upstream_error: false
          blocked: false
          config:
            - name: order-cb
              rr: 3 # 3 consecutive failures
              rt: 20 # move to half-open after 20 seconds
              rs: 60 # stay open for 60 seconds
consumers:
  - username: app-client
    jwt_secrets:
      - key: app-key-001
        algorithm: RS256
        rsa_public_key: |
          -----BEGIN PUBLIC KEY-----
          ...
          -----END PUBLIC KEY-----
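4. Distributed Transactions
4.1 The CAP Theorem and BASE
python
# ===== CAP theorem =====
"""
A distributed system can guarantee at most two of the following at the same time:
- Consistency: every read sees the most recent write (all nodes agree)
- Availability: every request receives a response
- Partition tolerance: the system keeps working when the network is partitioned
CA (no P): not realistic, because network failures are unavoidable
CP (gives up A): sacrifices availability during a partition
AP (gives up C): sacrifices consistency during a partition (eventual consistency)
Typical choices:
- ZooKeeper: CP (unavailable during leader election)
- Eureka: AP (registry data is eventually consistent)
- Kafka: usually described as AP (still writable during a partition), although the
  actual trade-off depends on replication and acknowledgement settings
"""
# ===== BASE =====
"""
Basically Available
- The system may degrade under failure (slower responses, but still a result)
- Example: during a flash sale, return "system busy" instead of an error
Soft state
- Data may be temporarily inconsistent across nodes (intermediate states exist)
- Keep the window of inconsistency as short as possible
Eventually consistent
- After some time, the data converges to a consistent state
- Example: after an order status change, the inventory service syncs asynchronously
"""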
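# ===== Decision tree for choosing a transaction pattern =====
def choose_transaction_mode(scenario):
    if scenario.get('consistency_required') and scenario.get('cross_service'):
        if scenario.get('sync_required'):
            # Try-Confirm-Cancel: resources are reserved in Try, then confirmed or cancelled
            return 'TCC'
        else:
            # Eventually consistent; every step needs a compensating action
            return 'Saga (orchestration)'
    elif scenario.get('performance_required'):
        return 'Reliable messaging with eventual consistency'  # MQ + local message table
    elif scenario.get('simple_scenario'):
        # Automatic rollback from undo logs; needs a relational DB with local ACID transactions, not XA
        return 'Seata AT mode'
    else:
        return 'Eventual consistency + manual intervention'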
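The "MQ + local message table" branch of the decision tree is worth seeing concretely. The idea: write the business row and the outgoing message in the same local transaction, then let a separate relay publish pending messages and mark them sent. The sketch below uses SQLite so it runs anywhere; the table names, the `publish` callback, and the function names are hypothetical, and a real relay would publish to Kafka or RabbitMQ instead of calling a Python function.

```python
import json
import sqlite3


def init_schema(conn: sqlite3.Connection) -> None:
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT);
        CREATE TABLE outbox (id INTEGER PRIMARY KEY, topic TEXT, payload TEXT,
                             sent INTEGER DEFAULT 0);
    """)


def create_order(conn: sqlite3.Connection, user_id: int) -> int:
    """Insert the order and its outgoing event in ONE local transaction."""
    with conn:  # commits on success, rolls back both inserts on error
        cur = conn.execute(
            "INSERT INTO orders (user_id, status) VALUES (?, 'PENDING')", (user_id,))
        order_id = cur.lastrowid
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("order.created", json.dumps({"order_id": order_id, "user_id": user_id})))
    return order_id


def relay_once(conn: sqlite3.Connection, publish) -> int:
    """Publish every unsent outbox row, marking each as sent after a successful publish."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE sent = 0 ORDER BY id").fetchall()
    for row_id, topic, payload in rows:
        publish(topic, json.loads(payload))  # if this raises, the row stays unsent
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    return len(rows)


# Usage with an in-memory database and a list standing in for the broker
conn = sqlite3.connect(":memory:")
init_schema(conn)
order_id = create_order(conn, user_id=1001)
published = []
first_run = relay_once(conn, lambda topic, payload: published.append((topic, payload)))
second_run = relay_once(conn, lambda topic, payload: published.append((topic, payload)))
```

Because a crash between publishing and marking the row can cause a re-publish, this pattern gives at-least-once delivery, which is why consumers still need to be idempotent.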
4.2 Seata AT Mode
sql
-- ===== Seata AT mode (automatic compensation) =====
-- How Seata AT mode works:
-- 1. Each participant (microservice) keeps an undo_log table in its own database
-- 2. The Transaction Coordinator (TC) coordinates the branch transactions
-- 3. On failure, branches are rolled back automatically from the undo log
-- ===== AT mode configuration =====
-- seata.conf (illustrative; check key names against the Seata version in use)
-- transport {
--   type = "TCP"
--   server = "nacos"
--   serverAddr = "127.0.0.1:8848"
--   serviceRegistry {
--     type = "nacos"
--     nacos {
--       namespace = ""
--       serverAddr = "127.0.0.1:8848"
--     }
--   }
-- }
-- application.yml
-- seata:
--   enabled: true
--   tx-service-group: my_test_tx_group
--   service:
--     vgroup-mapping:
--       my_test_tx_group: default
--     grouplist:
--       default: 127.0.0.1:8091
-- ===== Using @GlobalTransactional =====
-- OrderService.java
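@Service
public class OrderService {
    @Autowired
    private OrderMapper orderMapper;
    // Remote clients (for example Feign interfaces) for the inventory and payment services
    @Autowired
    private InventoryClient inventoryClient;
    @Autowired
    private PaymentClient paymentClient;

    @GlobalTransactional(name = "create-order", rollbackFor = Exception.class)
    public Order createOrder(Long userId, List<OrderItem> items) {
        // 1. Create the order (a branch transaction is registered automatically)
        Order order = new Order();
        order.setUserId(userId);
        order.setStatus("PENDING");
        orderMapper.insert(order);
        // 2. Call the inventory service to deduct stock
        //    If the inventory service fails, the global transaction rolls back
        inventoryClient.deduct(items);
        // 3. Call the payment service to charge the user
        //    (setting the order amount is omitted in this example)
        paymentClient.charge(userId, order.getAmount());
        // 4. Update the order status
        order.setStatus("PAID");
        orderMapper.updateById(order);
        return order;
    }
}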
-- undo_log table (must be created in each participant's business database before using AT mode)
-- CREATE TABLE `undo_log` (
--   `id` bigint NOT NULL AUTO_INCREMENT,
--   `branch_id` bigint NOT NULL,
--   `xid` varchar(100) NOT NULL,
--   `context` varchar(128) NOT NULL,
--   `rollback_info` longblob NOT NULL,
--   `log_status` int NOT NULL,
--   `log_created` datetime NOT NULL,
--   `log_modified` datetime NOT NULL,
--   UNIQUE KEY `KEY_UNDOLOG_ID` (`id`),
--   KEY `KEY_UNDOLOG_XID` (`xid`)
-- );
4.3 The Saga Pattern
python
# ===== Saga orchestration =====
# Saga orchestration can be built on Camunda or Temporal; this example uses Temporal
from temporalio import workflow, activity
from datetime import timedelta

# order_db, inventory_db, payment_db and notification_service are placeholders

# Activities
@activity.defn
async def create_order_activity(user_id, items):
    """Create the order."""
    order = order_db.create(user_id, items)
    return {'order_id': order.id, 'status': 'CREATED'}

@activity.defn
async def deduct_inventory_activity(items):
    """Deduct inventory."""
    for item in items:
        inventory_db.deduct(item['product_id'], item['quantity'])
    return {'success': True}

@activity.defn
async def charge_payment_activity(user_id, amount):
    """Charge the user."""
    payment_db.charge(user_id, amount)
    return {'success': True}

@activity.defn
async def send_notification_activity(user_id, order_id):
    """Send a notification."""
    notification_service.send(user_id, f'Order {order_id} has been created')
    return {'success': True}

# Compensating activities
@activity.defn
async def cancel_order_activity(order_id):
    """Cancel the order (compensation)."""
    order_db.cancel(order_id)
    return {'success': True}

@activity.defn
async def restore_inventory_activity(items):
    """Restore inventory (compensation)."""
    for item in items:
        inventory_db.restore(item['product_id'], item['quantity'])
    return {'success': True}

@activity.defn
async def refund_payment_activity(user_id, amount):
    """Refund the payment (compensation)."""
    payment_db.refund(user_id, amount)
    return {'success': True}
# Saga workflow definition
from temporalio.common import RetryPolicy  # RetryPolicy lives in temporalio.common

@workflow.defn
class OrderSagaWorkflow:
    @workflow.run
    async def run(self, user_id, items, amount) -> dict:
        # Compensations for the steps completed so far, in execution order
        compensations = []
        try:
            # Step 1: create the order (multiple arguments are passed via args=[...])
            order_result = await workflow.execute_activity(
                create_order_activity,
                args=[user_id, items],
                start_to_close_timeout=timedelta(seconds=10),
            )
            compensations.append((cancel_order_activity, [order_result['order_id']]))
            # Step 2: deduct inventory
            await workflow.execute_activity(
                deduct_inventory_activity,
                items,
                start_to_close_timeout=timedelta(seconds=10),
                retry_policy=RetryPolicy(maximum_attempts=3),
            )
            compensations.append((restore_inventory_activity, [items]))
            # Step 3: charge the payment
            await workflow.execute_activity(
                charge_payment_activity,
                args=[user_id, amount],
                start_to_close_timeout=timedelta(seconds=30),
            )
            compensations.append((refund_payment_activity, [user_id, amount]))
            # Step 4: send the notification
            await workflow.execute_activity(
                send_notification_activity,
                args=[user_id, order_result['order_id']],
                start_to_close_timeout=timedelta(seconds=5),
            )
            return {'status': 'SUCCESS', 'order_id': order_result['order_id']}
        except Exception as e:
            # Compensation is explicit: undo only the steps that completed, in reverse order
            for compensation, comp_args in reversed(compensations):
                await workflow.execute_activity(
                    compensation,
                    args=comp_args,
                    start_to_close_timeout=timedelta(seconds=10),
                )
            return {'status': 'COMPENSATED', 'error': str(e)}
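The core rule of a Saga is independent of any workflow engine: when a step fails, run the compensations of the steps that already succeeded, in reverse order, and nothing else. The following engine-free sketch makes that rule easy to test locally. The `run_saga` function and the toy steps are hypothetical and stand in for the Temporal activities above.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order.

    On failure, compensate only the steps that completed, in reverse order.
    """
    completed = []  # (name, compensation) for each step that succeeded
    for name, action, compensation in steps:
        try:
            action()
        except Exception as exc:
            for _done_name, undo in reversed(completed):
                undo()
            return {"status": "COMPENSATED", "failed_step": name, "error": str(exc)}
        completed.append((name, compensation))
    return {"status": "SUCCESS"}


# Usage: the payment step fails, so inventory and the order are undone, but no refund is issued
log = []

def fail_charge():
    raise RuntimeError("card declined")

result = run_saga([
    ("create_order", lambda: log.append("create"), lambda: log.append("cancel")),
    ("deduct_inventory", lambda: log.append("deduct"), lambda: log.append("restore")),
    ("charge_payment", fail_charge, lambda: log.append("refund")),
])
```

Note that the refund never runs, because the charge never succeeded. Compensating a step that did not happen (refunding an uncharged payment) is a common Saga bug.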
5. Service Mesh: Istio
5.1 Traffic Management
yaml
# ===== Istio VirtualService routing =====
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
    - order-service
  http:
    - name: primary-route
      match:
        - headers:
            x-request-type:
              exact: api
      route:
        - destination:
            host: order-service
            subset: v1
          weight: 80
        - destination:
            host: order-service
            subset: v2
          weight: 20
    - name: grpc-route
      match:
        - uri:
            prefix: "/order.v2"
      route:
        - destination:
            host: order-service-v2
            subset: stable
          weight: 100
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: order-service
spec:
  host: order-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        h2UpgradePolicy: UPGRADE
        http1MaxPendingRequests: 100
        http2MaxRequests: 1000
    loadBalancer:
      consistentHash:
        httpCookie:
          name: user
          ttl: 0s
    # DestinationRule has no circuitBreaker field; host ejection is configured via outlierDetection
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 10s
      baseEjectionTime: 30s
  subsets:
    - name: v1
      labels:
        version: v1.0.0
    - name: v2
      labels:
        version: v2.0.0
5.2 Circuit Breaking and Rate Limiting
yaml
# ===== Circuit breaking (DestinationRule) =====
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: inventory-service
spec:
  host: inventory-service
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 5 # eject a host after 5 consecutive 5xx responses
      interval: 30s # analysis interval
      baseEjectionTime: 30s # minimum ejection time
      maxEjectionPercent: 50 # eject at most 50% of the hosts
      minHealthPercent: 30 # outlier detection is disabled when fewer than 30% of hosts are healthy
---
# ===== Rate limiting (EnvoyFilter) =====
# EnvoyFilter is only available under the v1alpha3 API version
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: rate-limit-filter
spec:
  workloadSelector:
    labels:
      app: order-service
  configPatches:
    - applyTo: HTTP_FILTER
      match:
        context: SIDECAR_INBOUND
        listener:
          portNumber: 8080
          filterChain:
            filter:
              name: envoy.filters.network.http_connection_manager
      patch:
        operation: INSERT_BEFORE
        value:
          name: envoy.filters.http.local_ratelimit
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
            stat_prefix: http_local_rate_limiter
            token_bucket:
              max_tokens: 1000
              tokens_per_fill: 1000
              fill_interval: 60s
            filter_enabled:
              runtime_fraction:
                default_value:
                  numerator: 100
                  denominator: HUNDRED
            # Without filter_enforced the limit is evaluated but requests are not rejected
            filter_enforced:
              runtime_fraction:
                default_value:
                  numerator: 100
                  denominator: HUNDRED
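Istio's outlier detection ejects unhealthy hosts from the pool. The classic application-level circuit breaker is a slightly different thing: a three-state machine (CLOSED → OPEN → HALF_OPEN) wrapped around a call. A minimal sketch of that state machine follows, to make the thresholds above easier to reason about. It is not how Envoy implements ejection; the class name, parameter names, and the explicit `now` arguments (used to keep the example deterministic) are assumptions.

```python
class CircuitBreaker:
    """Three-state circuit breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED or OPEN."""

    def __init__(self, failure_threshold: int = 5, open_seconds: float = 30.0):
        self.failure_threshold = failure_threshold
        self.open_seconds = open_seconds
        self.state = "CLOSED"
        self.failures = 0      # consecutive failures while CLOSED
        self.opened_at = 0.0

    def allow(self, now: float) -> bool:
        """Return True if a call may be attempted at time `now`."""
        if self.state == "OPEN":
            if now - self.opened_at >= self.open_seconds:
                self.state = "HALF_OPEN"  # let probe calls through
                return True
            return False
        return True

    def record_success(self) -> None:
        self.failures = 0
        self.state = "CLOSED"

    def record_failure(self, now: float) -> None:
        self.failures += 1
        # A failed probe re-opens immediately; otherwise open once the threshold is reached
        if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
            self.state = "OPEN"
            self.opened_at = now
            self.failures = 0


# Usage: 5 consecutive failures open the breaker for 30 seconds
cb = CircuitBreaker(failure_threshold=5, open_seconds=30.0)
for _ in range(5):
    cb.record_failure(now=0.0)
state_after_failures = cb.state
blocked_at_10s = not cb.allow(now=10.0)
probe_allowed_at_30s = cb.allow(now=30.0)
cb.record_success()
```

This simple version lets every call through while HALF_OPEN; production libraries usually cap the number of concurrent probe calls.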
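6. Observability
6.1 Distributed Tracing
yaml
# ===== OpenTelemetry instrumentation =====
# Spring Boot configuration
# Note: the exact property names depend on the Spring Boot / OpenTelemetry starter version in use
spring:
  application:
    name: order-service
  otlp:
    tracing:
      endpoint: http://otel-collector:4317
      exporter: otlp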
# ===== Manual instrumentation (Java) =====
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.StatusCode;
import io.opentelemetry.context.Scope;

@Service
public class OrderService {
    private final Tracer tracer;

    @Autowired
    public OrderService(Tracer tracer) {
        this.tracer = tracer;
    }

    public Order createOrder(Long userId, List<OrderItem> items) {
        // Create a span
        Span span = tracer.spanBuilder("OrderService.createOrder")
                .setAttribute("user.id", userId)
                .setAttribute("order.items.count", items.size())
                .startSpan();
        try (Scope scope = span.makeCurrent()) {
            // Business logic
            Order order = doCreateOrder(userId, items);
            // Record the result
            span.setAttribute("order.id", order.getId());
            span.setAttribute("order.amount", order.getAmount());
            return order;
        } catch (Exception e) {
            span.recordException(e);
            span.setStatus(StatusCode.ERROR, e.getMessage());
            throw e;
        } finally {
            span.end(); // end the span
        }
    }

    private Order doCreateOrder(Long userId, List<OrderItem> items) {
        // The caller's span is current, so this span becomes its child without an explicit setParent
        Span childSpan = tracer.spanBuilder("doCreateOrder")
                .startSpan();
        try (Scope scope = childSpan.makeCurrent()) {
            // Validate the user
            User user = userService.getUser(userId);
            // Calculate the price
            BigDecimal amount = priceService.calculate(items);
            // Create the order
            Order order = orderRepository.save(...);
            return order;
        } finally {
            childSpan.end();
        }
    }
}
# ===== OpenTelemetry Collector configuration =====
# otel-collector-config.yaml
# Note: the jaeger exporter has been removed from recent Collector releases, which send
# OTLP to Jaeger instead; the jaeger and loki exporters below need a Collector build that still ships them
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
processors:
  batch:
    timeout: 10s
    send_batch_size: 1024
exporters:
  jaeger:
    endpoint: jaeger:14250
    tls:
      insecure: true
  prometheus:
    endpoint: "0.0.0.0:8889"
  loki:
    endpoint: "http://loki:3100/loki/api/v1/push"
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [jaeger]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [loki]
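What makes spans from different services line up into one trace is context propagation: each outgoing request carries the trace ID and the caller's span ID, typically in the W3C `traceparent` header (`version-traceid-spanid-flags`). OpenTelemetry SDKs do this automatically; the sketch below only illustrates the header mechanics. The function names are hypothetical.

```python
import re
import secrets

# version "00", 32 hex chars of trace id, 16 hex chars of parent span id, 2 hex chars of flags
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(header):
    """Parse a W3C traceparent header; return None if it is missing or invalid."""
    match = _TRACEPARENT.match(header or "")
    if not match:
        return None
    trace_id, parent_id, flags = match.groups()
    if trace_id == "0" * 32 or parent_id == "0" * 16:  # all-zero ids are invalid
        return None
    return {"trace_id": trace_id, "parent_id": parent_id, "flags": flags}


def child_traceparent(incoming=None):
    """Build the header for an outgoing call: same trace id, fresh span id.

    Starts a new sampled trace when there is no valid incoming header.
    """
    ctx = parse_traceparent(incoming)
    trace_id = ctx["trace_id"] if ctx else secrets.token_hex(16)
    flags = ctx["flags"] if ctx else "01"
    span_id = secrets.token_hex(8)
    return f"00-{trace_id}-{span_id}-{flags}"


# Usage: the order service receives a request and calls the inventory service
incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
outgoing = child_traceparent(incoming)
```

Logging the `trace_id` from this header in every service is also what makes it possible to jump from a log line to the corresponding trace.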
6.2 Log Aggregation
python
# ===== Structured logging (JSON) =====
import logging
import json
from datetime import datetime

# Without a handler and level, INFO records are dropped by default
logging.basicConfig(level=logging.INFO, format='%(message)s')

class StructuredLogger:
    def __init__(self, service_name):
        self.service_name = service_name
        self.logger = logging.getLogger(service_name)

    def log(self, level, event, **kwargs):
        log_entry = {
            'timestamp': datetime.now().isoformat(),
            'service': self.service_name,
            'level': level,
            'event': event,
            **kwargs
        }
        self.logger.log(
            getattr(logging, level),
            json.dumps(log_entry)
        )

    def info(self, event, **kwargs):
        self.log('INFO', event, **kwargs)

    def error(self, event, **kwargs):
        self.log('ERROR', event, **kwargs)

    def warn(self, event, **kwargs):
        self.log('WARNING', event, **kwargs)

logger = StructuredLogger('order-service')

# Usage
logger.info('order_created',
    order_id='12345',
    user_id='1001',
    amount=199.99,
    duration_ms=45
)
# JSON output
# {"timestamp": "2024-06-10T12:00:00", "service": "order-service",
#  "level": "INFO", "event": "order_created", "order_id": "12345", ...}
# ===== Loki query examples =====
# LogQL syntax
# All error logs from order-service
{service="order-service"} |= "ERROR"
# Logs for a specific order_id
{service="order-service"} | json | order_id="12345"
# Errors per minute (the json stage extracts `level` so it can be filtered and grouped on)
sum by (level) (
  count_over_time(
    {service="order-service"} | json | level="ERROR" [1m]
  )
)
# Correlate with distributed tracing (Jaeger trace ID)
{service="order-service"} | json | trace_id=`${trace_id}`
6.3 Prometheus Metrics
python
# ===== Prometheus metric definitions =====
from prometheus_client import Counter, Histogram, Gauge, CollectorRegistry

registry = CollectorRegistry()

# Counter (monotonically increasing)
order_created_total = Counter(
    'order_created_total',
    'Total number of orders created',
    ['status', 'payment_type'],
    registry=registry
)
order_created_total.labels(status='success', payment_type='wechat').inc()

# Histogram (distribution)
order_processing_duration = Histogram(
    'order_processing_duration_seconds',
    'Order processing duration',
    buckets=[0.01, 0.05, 0.1, 0.5, 1.0, 5.0],
    registry=registry
)
with order_processing_duration.time():
    process_order()  # placeholder for the code being timed

# Gauge (current value)
active_orders = Gauge(
    'active_orders_count',
    'Number of active orders',
    ['region'],
    registry=registry
)
active_orders.labels(region='beijing').set(1523)
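yaml
# ===== Spring Boot Actuator + Micrometer =====
# application.yml
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  endpoint:
    health:
      show-details: always
  metrics:
    tags:
      application: ${spring.application.name}
    export:
      prometheus:
        enabled: true  # Spring Boot 2.x key; 3.x uses management.prometheus.metrics.export.enabled
java
// Custom common tags applied to every meter
@Bean
public MeterRegistryCustomizer<MeterRegistry> metricsCommonTags() {
    // Fall back to a default so that a missing REGION variable never yields a null tag value
    String region = System.getenv().getOrDefault("REGION", "unknown");
    return registry -> registry.config()
            .commonTags("application", "order-service")
            .commonTags("region", region);
}

// Automatic timing with @Timed.
// On arbitrary bean methods this requires a TimedAspect bean and Spring AOP on the classpath.
@Timed(value = "order.create", description = "Time to create an order")
public Order createOrder(CreateOrderRequest request) {
    // ...
}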
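Prometheus histograms, such as `order_processing_duration` above, store cumulative bucket counts: an observation is counted in every bucket whose upper bound (`le`) is greater than or equal to the value, plus the implicit `+Inf` bucket. The sketch below reproduces that bookkeeping with the same bucket bounds. `bucket_counts` is a hypothetical helper written for illustration, not part of `prometheus_client`.

```python
import math

# Same upper bounds as the order_processing_duration histogram
BUCKETS = [0.01, 0.05, 0.1, 0.5, 1.0, 5.0]


def bucket_counts(observations, bounds=BUCKETS):
    """Return cumulative counts keyed by upper bound, including +Inf."""
    uppers = list(bounds) + [math.inf]
    return {le: sum(1 for v in observations if v <= le) for le in uppers}


counts = bucket_counts([0.03, 0.2, 0.2, 7.0])
for le, n in counts.items():
    print(f"le={le}: {n}")
```

The 7-second observation lands only in `+Inf`, which is why the largest finite bucket should sit above the slowest latency you still want to resolve.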
7. Containerization and Orchestration
7.1 Docker Compose for Local Development
yaml
# ===== docker-compose.yml =====
version: '3.8'
services:
  # Service registry
  nacos:
    image: nacos/nacos-server:v2.2.3
    environment:
      MODE: standalone
      SPRING_DATASOURCE_PLATFORM: mysql
      MYSQL_SERVICE_HOST: mysql
      # This database must already exist (e.g. created in init.sql);
      # MYSQL_DATABASE on the mysql service only creates `shop`.
      MYSQL_SERVICE_DB_NAME: nacos_config
      MYSQL_SERVICE_PORT: 3306
      MYSQL_SERVICE_USER: root
      MYSQL_SERVICE_PASSWORD: ${DB_PASSWORD}
    ports:
      - "8848:8848"
      - "9848:9848"
    depends_on:
      mysql:
        condition: service_healthy
  # API gateway
  gateway:
    build: ./gateway
    ports:
      - "8080:8080"
    environment:
      SPRING_CLOUD_NACOS_DISCOVERY_SERVER-ADDRESS: nacos:8848
    depends_on:
      - nacos
  # Business service
  user-service:
    build: ./user-service
    ports:
      - "8081:8080"
    environment:
      SPRING_CLOUD_NACOS_DISCOVERY_SERVER-ADDRESS: nacos:8848
      SPRING_CLOUD_NACOS_CONFIG_SERVER-ADDRESS: nacos:8848
      SPRING_DATASOURCE_URL: jdbc:mysql://mysql:3306/shop
      SPRING_DATASOURCE_USERNAME: root
      SPRING_DATASOURCE_PASSWORD: ${DB_PASSWORD}
    depends_on:
      - nacos
      - mysql
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
        reservations:
          cpus: '0.1'
          memory: 128M
  # MySQL
  mysql:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: ${DB_PASSWORD}
      MYSQL_DATABASE: shop
    ports:
      - "3306:3306"
    volumes:
      - mysql-data:/var/lib/mysql
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    healthcheck:
      test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
      interval: 10s
      timeout: 5s
      retries: 5
  # Redis
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
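  # ZooKeeper (referenced by the Kafka service below, so it has to be defined)
  zookeeper:
    image: confluentinc/cp-zookeeper:7.5.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  # Kafka
  kafka:
    image: confluentinc/cp-kafka:7.5.0
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      # Advertises the in-network hostname; clients on the host machine
      # would need an additional listener that advertises localhost.
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"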
  # Jaeger distributed tracing
  jaeger:
    image: jaegertracing/all-in-one:1.50
    ports:
      - "6831:6831/udp"
      - "16686:16686"
  # Prometheus
  prometheus:
    image: prom/prometheus:v2.47.0
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    depends_on:
      - user-service
      # Add order-service here once it is defined under `services`
      # (analogous to user-service); Compose rejects undefined dependencies.
volumes:
  mysql-data:
  redis-data:
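Docker Compose refuses to start a project when `depends_on` names a service that is not defined under `services`. A rough sketch of that check follows, assuming the Compose file has already been parsed into a plain dictionary. The `compose` literal is a trimmed, hypothetical stand-in and `undefined_dependencies` is an illustrative helper, not a Compose API.

```python
def undefined_dependencies(compose):
    """Map each service to the depends_on targets that are not defined."""
    services = compose.get("services", {})
    problems = {}
    for name, spec in services.items():
        # depends_on is either a list of names or a mapping name -> options;
        # iterating over either form yields the service names.
        deps = spec.get("depends_on", [])
        missing = sorted(d for d in deps if d not in services)
        if missing:
            problems[name] = missing
    return problems


compose = {
    "services": {
        "mysql": {},
        "nacos": {"depends_on": {"mysql": {"condition": "service_healthy"}}},
        "user-service": {"depends_on": ["nacos", "mysql"]},
        "kafka": {"depends_on": ["zookeeper"]},
        "prometheus": {"depends_on": ["user-service", "order-service"]},
    }
}
print(undefined_dependencies(compose))
```

Running a check like this in CI catches a dangling dependency before `docker compose up` does.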
7.2 Kubernetes Deployment
yaml
# ===== Deployment + Service =====
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  labels:
    app: order-service
    version: v1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # at most one extra Pod during a rollout
      maxUnavailable: 0  # never drop below the desired replica count
  template:
    metadata:
      labels:
        app: order-service
        version: v1
    spec:
      containers:
        - name: order-service
          image: registry.example.com/order-service:v1.0.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 100m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 1Gi
          # On Spring Boot 2.3+, /actuator/health/readiness and
          # /actuator/health/liveness are the dedicated probe endpoints.
          readinessProbe:
            httpGet:
              path: /actuator/health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /actuator/health
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 15
          env:
            - name: SPRING_CLOUD_NACOS_DISCOVERY_SERVER-ADDRESS
              valueFrom:
                configMapKeyRef:
                  name: microservice-config
                  key: nacos.address
            - name: SPRING_DATASOURCE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: password
---
apiVersion: v1
kind: Service
metadata:
  name: order-service
spec:
  type: ClusterIP
  selector:
    app: order-service
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: order-service-headless
spec:
  type: ClusterIP
  clusterIP: None  # headless Service: DNS resolves directly to Pod IPs
  selector:
    app: order-service
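---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  minReplicas: 3
  maxReplicas: 20
  metrics:
    # Utilization is measured against the container's resource *requests*,
    # not its limits.
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    # With a 256Mi memory request, a JVM can sit above 80% permanently;
    # size the request realistically before relying on this metric.
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80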
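The HPA's core scaling rule is `desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue)`, evaluated per metric, with the largest result winning and the outcome clamped to `minReplicas`/`maxReplicas`. The sketch below is a simplified model of that rule only. It ignores stabilization windows, unready Pods, and scaling policies; `desired_replicas` is a hypothetical helper, and the default tolerance of 0.1 is the controller's documented default.

```python
import math


def desired_replicas(current, metric_ratios, min_replicas, max_replicas,
                     tolerance=0.1):
    """metric_ratios holds currentValue / targetValue for each metric."""
    proposals = []
    for ratio in metric_ratios:
        if abs(ratio - 1.0) <= tolerance:
            proposals.append(current)  # close enough to target: no change
        else:
            proposals.append(math.ceil(current * ratio))
    wanted = max(proposals)  # the metric demanding the most replicas wins
    return max(min_replicas, min(max_replicas, wanted))


# CPU at 105% of requests against a 70% target,
# memory at 60% of requests against an 80% target, 3 replicas running
print(desired_replicas(3, [105 / 70, 60 / 80], min_replicas=3, max_replicas=20))
```

Here CPU asks for 5 replicas and memory for 3, so the Deployment is scaled to 5.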
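8. Summary
Technology Overview
| Layer | Core concepts | Key points |
|---|---|---|
| Service decomposition | DDD + Conway's law | Rule of thumb: service count ≈ team count × 2 to 3 |
| Service communication | REST/gRPC + MQ | Synchronous calls for low latency, asynchronous messaging for decoupling |
| Registration and discovery | Nacos / Eureka | Client-side load balancing |
| API gateway | Spring Cloud Gateway / Kong | Single entry point + authentication + rate limiting |
| Distributed transactions | Seata AT / Saga | Eventual consistency vs. strong consistency |
| Service mesh | Istio | Traffic management + circuit breaking |
| Distributed tracing | OpenTelemetry + Jaeger | End-to-end spans |
| Logging | Structured JSON + ELK/Loki | Correlated by trace_id |
| Monitoring | Prometheus + Grafana | Golden signals (latency, traffic, errors, saturation) |
| Kubernetes | HPA + rolling updates | Automatic scaling |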
Microservice Technology Selection Matrix
| Scenario | Recommended | Alternatives |
|---|---|---|
| Service registration and discovery | Nacos | Consul / Eureka |
| Configuration center | Apollo | Nacos Config |
| API gateway | Spring Cloud Gateway | Kong / Envoy |
| Distributed transactions | Seata AT | Saga (Temporal) |
| Message queue | Kafka | RabbitMQ / RocketMQ |
| Distributed tracing | SkyWalking | Jaeger / Zipkin |
| Log aggregation | Loki | ELK Stack |
| Service mesh | Istio | Linkerd |
| Container orchestration | Kubernetes | Docker Swarm |
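This article has covered the microservice architecture end to end: DDD-driven service decomposition, the Spring Cloud technology stack, Nacos for service registration and discovery, API gateways with Spring Cloud Gateway and Kong, distributed transactions with Seata AT and Saga, traffic management with the Istio service mesh, observability (distributed tracing, log aggregation, and metrics), local development with Docker Compose, and production deployment and autoscaling on Kubernetes.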