[Architecture in Practice] Cloud-Native Architecture Design: The Transformation from Traditional to Cloud-Native Architecture

1. Why We Need Cloud Native

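In 2019, our systems were still running on traditional physical machines. Deploying an application once required:

  1. Requesting a server (1-2 weeks)
  2. Configuring the environment (1-2 days)
  3. Deploying the application (half a day)
  4. Setting up monitoring (half a day)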

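When traffic spiked unexpectedly, for example during a promotion, there was simply no time to scale out. Once the promotion ended, the servers sat idle again.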

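Later, we started experimenting with Docker containers. The benefits were immediate:

  • Consistent environments: development, testing, and production are identical
  • Startup in seconds: about 100 times faster than bringing up a physical machine
  • Elastic scaling: scale out or in within minutes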

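But that was not enough. We needed more than containerization; we needed cloud native: a complete set of architectural principles and practices built on the cloud.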


2. Containerization in Practice

2.1 Docker Best Practices

# Multi-stage build: keeps the final image small
FROM maven:3.8-openjdk-8 AS builder
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline
COPY src ./src
RUN mvn clean package -DskipTests

# Runtime image: contains only the JRE
FROM eclipse-temurin:8-jre-alpine
WORKDIR /app

# Security: create a non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup

# Copy the build artifact
COPY --from=builder /app/target/*.jar app.jar

# Set ownership
RUN chown -R appuser:appgroup /app
USER appuser

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=60s \
    CMD wget -qO- http://localhost:8080/actuator/health || exit 1

# JVM tuning
ENV JAVA_OPTS="-XX:+UseG1GC -XX:MaxRAMFraction=2 -XX:+ExitOnOutOfMemoryError"

# exec replaces the shell so the JVM runs as PID 1 and receives SIGTERM directly
ENTRYPOINT ["sh", "-c", "exec java $JAVA_OPTS -jar app.jar"]

2.2 Docker Compose for Local Development

version: '3.8'
services:
  app:
    build: .
    ports:
      - "8080:8080"
    environment:
      - SPRING_PROFILES_ACTIVE=local
      - SPRING_DATASOURCE_URL=jdbc:mysql://mysql:3306/testdb
      - SPRING_REDIS_HOST=redis
    depends_on:
      - mysql
      - redis
    volumes:
      - ./logs:/app/logs
    networks:
      - app-network
  
  mysql:
    image: mysql:8.0
    environment:
      - MYSQL_ROOT_PASSWORD=root123
      - MYSQL_DATABASE=testdb
    ports:
      - "3306:3306"
    volumes:
      - mysql-data:/var/lib/mysql
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    networks:
      - app-network
  
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
    networks:
      - app-network
  
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "16686:16686"
    networks:
      - app-network

volumes:
  mysql-data:
  redis-data:

networks:
  app-network:
    driver: bridge
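
One gap in the Compose file above: the short form of `depends_on` only orders container startup; it does not wait for MySQL to accept connections. Below is a minimal sketch of one way to close that gap, assuming Docker Compose v2 (which supports the long-form `depends_on` with conditions). The health check command and its timings are illustrative assumptions, not part of the original setup.

```yaml
# Fragment: only the keys that change relative to the Compose file above.
services:
  app:
    depends_on:
      mysql:
        condition: service_healthy   # wait until the health check below passes
      redis:
        condition: service_started   # same behaviour as the short form

  mysql:
    healthcheck:
      # Assumed probe: mysqladmin ships in the official mysql image.
      test: ["CMD", "mysqladmin", "ping", "-h", "127.0.0.1", "-uroot", "-proot123"]
      interval: 5s
      timeout: 3s
      retries: 10
      start_period: 20s
```

This only helps local development; in Kubernetes the same ordering problem is handled with an init container.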

3. Kubernetes Core Concepts

3.1 Pod Configuration

apiVersion: v1
kind: Pod
metadata:
  name: order-service-pod
  labels:
    app: order-service
    version: v1
spec:
  # Graceful termination
  terminationGracePeriodSeconds: 60
  
  # Init containers
  initContainers:
    - name: init-db
      image: busybox:1.36
      command:
        - sh
        - -c
        - |
          echo "Waiting for database to be ready..."
          until nc -z mysql 3306; do
            sleep 2
          done
          echo "Database is ready!"
  
  containers:
    - name: order-service
      image: registry.example.com/order-service:v1.2.3
      ports:
        - containerPort: 8080
          name: http
        - containerPort: 9090
          name: grpc
      
      # Resource requests and limits
      resources:
        requests:
          memory: "512Mi"
          cpu: "250m"
        limits:
          memory: "1Gi"
          cpu: "1000m"
      
      # Health checks
      livenessProbe:
        httpGet:
          path: /actuator/health/liveness
          port: 8080
        initialDelaySeconds: 60
        periodSeconds: 10
        failureThreshold: 3
      
      readinessProbe:
        httpGet:
          path: /actuator/health/readiness
          port: 8080
        initialDelaySeconds: 30
        periodSeconds: 5
        failureThreshold: 3
      
      # Environment variables
      env:
        - name: SPRING_PROFILES_ACTIVE
          value: "production"
        - name: JAVA_OPTS
          value: "-Xmx768m -Xms512m -XX:+UseG1GC"
      
      # Volume mounts
      volumeMounts:
        - name: app-logs
          mountPath: /app/logs
        - name: config
          mountPath: /app/config
          readOnly: true
  
  # Volumes
  volumes:
    - name: app-logs
      emptyDir: {}
    - name: config
      configMap:
        name: order-service-config
  
  # Affinity scheduling
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
                - key: app
                  operator: In
                  values:
                    - order-service
            topologyKey: kubernetes.io/hostname
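
The Pod above sets `terminationGracePeriodSeconds: 60` but defines no preStop hook. Below is a minimal sketch of a container-level fragment that is often paired with that setting. The 10-second pause is an assumed value, and the fragment relies on the image providing `sh` and `sleep`, which the Alpine-based image from section 2.1 does.

```yaml
# Fragment of a container spec (sits under spec.containers[], next to `image:`).
lifecycle:
  preStop:
    exec:
      # Assumption: a short pause gives Services and load balancers time to
      # stop routing to this Pod before SIGTERM reaches the JVM.
      command: ["sh", "-c", "sleep 10"]
```

The time spent in the hook counts against the grace period, so the sleep plus the application's own shutdown time should stay under 60 seconds.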

3.2 Deployment Configuration

apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  labels:
    app: order-service
spec:
  replicas: 3
  
  # Rolling update strategy
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most 1 Pod above the desired count
      maxUnavailable: 0    # never drop below the desired replica count
  
  selector:
    matchLabels:
      app: order-service
  
  template:
    metadata:
      labels:
        app: order-service
        version: v1
    spec:
      containers:
        - name: order-service
          image: registry.example.com/order-service:v1.2.3
          ports:
            - containerPort: 8080
          
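          # CPU requests are required for the HPA's Utilization target below
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"

          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name

      # Readiness gate: a Pod-level field (sibling of containers), not a container field.
      # The Pod is only Ready once an external controller sets this condition to True.
      readinessGates:
        - conditionType: "PrometheusReady"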

---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "1000"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
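
To read the HPA above, it helps to recall the scaling rule documented for the Kubernetes HPA controller. The numbers below are made up for illustration: 3 running replicas whose average CPU utilization (measured against the CPU request) is 105%, compared with the 70% target.

```latex
\text{desiredReplicas}
  = \left\lceil \text{currentReplicas} \times
      \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
  = \left\lceil 3 \times \frac{105}{70} \right\rceil
  = \lceil 4.5 \rceil
  = 5
```

When several metrics are configured, as here, the controller evaluates each one and uses the largest result. The `behavior` section then limits how fast the replica count may move toward that number.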

4. Service Mesh with Istio

# Istio VirtualService: traffic management
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
    - order-service
  http:
    - match:
        - headers:
            x-canary:
              exact: "true"
      route:
        - destination:
            host: order-service
            subset: v2
          weight: 100
    - route:
        - destination:
            host: order-service
            subset: v1
          weight: 90
        - destination:
            host: order-service
            subset: v2
          weight: 10   # canary: 10% of traffic goes to v2

---
# DestinationRule: circuit-breaking configuration
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: order-service
spec:
  host: order-service
  trafficPolicy:
    outlierDetection:
      consecutiveGatewayErrors: 5
      interval: 30s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        h2UpgradePolicy: UPGRADE
        http1MaxPendingRequests: 100
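        http2MaxRequests: 1000
  # Subsets referenced by the VirtualService; assumes Pods carry a matching `version` label
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2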

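5. Pitfalls We Hit

Pitfall 1: Pod startup ordering

One of our services needs a database connection, but the database was not fully ready when the Pod started, so startup failed and the Pod went into CrashLoopBackOff.

Fix: use an init container to wait until the dependency is ready: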

initContainers:
  - name: wait-for-db
    image: busybox:1.36
    command: ['sh', '-c', 'until nc -z mysql 3306; do sleep 2; done']

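Pitfall 2: OOMKilled with no alert

Pods were being OOM-killed, but no alert fired: the container was killed outright and restarted, so from the monitoring side it simply disappeared.

Fix:

  1. Set reasonable resource limits
  2. Add an alert on OOMKilled terminations
  3. Use a preStop hook for graceful shutdown on normal termination (the hook does not run when the kernel OOM-kills a container, so it complements the alert rather than replacing it)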
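
For the second item, here is a minimal sketch of an alert rule. It assumes Prometheus scrapes kube-state-metrics, neither of which is mentioned in the original setup. The two metric names are ones kube-state-metrics exposes; the group name, alert name, 10-minute window, and severity are illustrative choices.

```yaml
# Prometheus rule file (sketch). Fires when a container's last termination
# reason is OOMKilled and it has restarted within the last 10 minutes.
groups:
  - name: oom-alerts              # hypothetical group name
    rules:
      - alert: ContainerOOMKilled # hypothetical alert name
        expr: |
          (kube_pod_container_status_last_terminated_reason{reason="OOMKilled"} == 1)
          and on (namespace, pod, container)
          (increase(kube_pod_container_status_restarts_total[10m]) > 0)
        labels:
          severity: warning
        annotations:
          summary: "Container {{ $labels.container }} in {{ $labels.namespace }}/{{ $labels.pod }} was OOMKilled"
```

Without any alerting stack, `kubectl describe pod <pod-name>` shows the same fact under the container's "Last State" (Reason: OOMKilled), which is a quick way to confirm the diagnosis by hand.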

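6. Summary

The core points of cloud-native architecture:

  1. Containerization: consistent environments, fast deployment
  2. Orchestration: automated management with Kubernetes
  3. Microservices: independent deployment, independent scaling
  4. Service mesh: traffic management and observability
  5. Declarative configuration: GitOps, managing infrastructure as code

Cloud native is not the goal; improving delivery efficiency is.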


Personal opinion, for reference only.