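The Complete Guide to CI/CD Configuration: Building an Automated Pipeline from Scratch

Introduction to CI/CD

What Is CI/CD

CI/CD stands for Continuous Integration, Continuous Delivery, and Continuous Deployment. It is a set of practices that uses automation to deliver applications to customers frequently and reliably.

Continuous Integration (CI): developers merge code into a shared repository frequently, and every merge automatically triggers a build and a test run.
Continuous Delivery (CD): every change that passes the tests is automatically deployed to a staging (pre-production) environment, so it can be released at any time; the release to production still requires a manual approval.
Continuous Deployment: every change that passes validation in staging is automatically deployed to production, with no manual step.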

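Benefits of CI/CD

Fewer manual errors
Faster feedback cycles
Higher code quality and release frequency
Lower release risk
More efficient team collaboration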

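Choosing a CI/CD Tool

Comparison of Mainstream CI/CD Tools

Tool	Language support	Hosting	Ease of use	Extensibility	Cost
Jenkins	All	Self-hosted	Medium	High	Free
GitLab CI/CD	All	Self-hosted/SaaS	High	Medium	Free/Paid
GitHub Actions	All	SaaS (self-hosted runners supported)	High	High	Free/Paid
Travis CI	Multiple	SaaS	High	Medium	Free/Paid
CircleCI	All	SaaS	High	Medium	Free/Paid

Recommendations

Small or personal projects: GitHub Actions or GitLab CI/CD
Enterprise projects: Jenkins or self-hosted GitLab CI/CD
Cloud-native applications: GitHub Actions or the CI/CD service offered by your cloud provider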

Basic Environment Setup

1. Version Control Setup

Git Configuration

# Global settings
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"

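# Initialize the repository
git init
git add .
git commit -m "Initial commit"

# Name the branch "main" (older Git versions default to "master"), then connect and push to the remote
git branch -M main
git remote add origin https://github.com/username/repository.git
git push -u origin main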

.gitignore Configuration

# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
venv/
.env
.venv
pip-log.txt
pip-delete-this-directory.txt

# Node.js
node_modules/
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# IDE
.vscode/
.idea/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db

2. Project Structure Conventions

project-root/
├── .github/
│   └── workflows/          # GitHub Actions workflows
├── .gitlab-ci.yml          # GitLab CI configuration
├── Jenkinsfile             # Jenkins pipeline configuration
├── src/                    # Source code
├── tests/                  # Test code
├── docs/                   # Documentation
├── requirements.txt        # Python dependencies
├── package.json            # Node.js dependencies
├── Dockerfile              # Docker image definition
└── docker-compose.yml      # Docker Compose configuration

CI Pipeline Setup

1. GitHub Actions Configuration

Creating the Workflow File

# .github/workflows/ci.yml
name: CI Pipeline

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ['3.8', '3.9', '3.10', '3.11']

    steps:
    - uses: actions/checkout@v3
    
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v3
      with:
        python-version: ${{ matrix.python-version }}
    
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -r requirements.txt
        pip install pytest pytest-cov
    
    - name: Run tests with coverage
      run: |
        pytest --cov=./src --cov-report=xml
    
    - name: Upload coverage to Codecov
      uses: codecov/codecov-action@v3

  lint:
    runs-on: ubuntu-latest
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Set up Python
      uses: actions/setup-python@v3
      with:
        python-version: '3.10'
    
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install flake8 black
    
    - name: Run flake8
      run: |
        flake8 src/ --count --select=E9,F63,F7,F82 --show-source --statistics
        flake8 src/ --count --exit-zero --max-complexity=10 --max-line-length=88 --statistics
    
    - name: Check code formatting with black
      run: black --check src/

  security:
    runs-on: ubuntu-latest
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Run security scan
      run: |
        # Scan the source tree with Bandit, a static security analyzer for Python
        python -m pip install bandit
        bandit -r src/

2. GitLab CI/CD Configuration

Example .gitlab-ci.yml

# .gitlab-ci.yml
stages:
  - test
  - build
  - deploy

variables:
  PYTHON_VERSION: "3.10"
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"

cache:
  paths:
    - .cache/pip
    - venv/

before_script:
  - python -m venv venv
  - source venv/bin/activate
  - pip install --upgrade pip
  - pip install -r requirements.txt

test:
  stage: test
  image: python:$PYTHON_VERSION
  script:
    - pip install pytest pytest-cov
    - pytest --cov=src --cov-report=term --cov-report=xml tests/
  coverage: '/TOTAL.+?(\d+\%)$/'
  artifacts:
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage.xml

lint:
  stage: test
  image: python:$PYTHON_VERSION
  script:
    - pip install flake8 black
    - flake8 src/
    - black --check src/

security:
  stage: test
  image: python:$PYTHON_VERSION
  script:
    - pip install bandit
    - bandit -r src/

build:
  stage: build
  image: docker:latest
  services:
    - docker:dind
  # Skip the global Python before_script: the docker image has no Python interpreter
  before_script: []
  script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  only:
    - main
    - develop
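
The `coverage` keyword in the test job holds a regular expression that GitLab applies to the job log to extract the coverage percentage. The sketch below applies the same pattern in Python to a sample of pytest-cov terminal output, which can help when checking a pattern before committing it. The sample log text and the function name `extract_coverage` are illustrative assumptions, and GitLab uses its own regex engine, so this only approximates its behaviour.

```python
import re
from typing import Optional

# Same pattern as the `coverage:` keyword, without the surrounding slashes.
COVERAGE_PATTERN = re.compile(r"TOTAL.+?(\d+%)$", re.MULTILINE)

# Hypothetical excerpt of a pytest-cov terminal report.
SAMPLE_LOG = """\
Name          Stmts   Miss  Cover
---------------------------------
src/app.py       40      5    88%
---------------------------------
TOTAL            40      5    88%
"""


def extract_coverage(log_text: str) -> Optional[str]:
    """Return the percentage captured from the TOTAL line, or None if absent."""
    match = COVERAGE_PATTERN.search(log_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_coverage(SAMPLE_LOG))
```

If the function returns `None` for a real job log, the report format probably differs from the one the pattern expects.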

3. Jenkins Pipeline Configuration

Example Jenkinsfile

// Jenkinsfile
pipeline {
    agent any
    
    environment {
        PYTHON_VERSION = '3.10'
        DOCKER_REGISTRY = 'your-registry.com'
        IMAGE_NAME = 'your-app'
    }
    
    stages {
        stage('Checkout') {
            steps {
                checkout scm
            }
        }
        
        stage('Setup Environment') {
            steps {
                sh 'python -m venv venv'
                sh '. venv/bin/activate && pip install --upgrade pip'
                sh '. venv/bin/activate && pip install -r requirements.txt'
            }
        }
        
        stage('Lint') {
            steps {
                sh '. venv/bin/activate && pip install flake8 black'
                sh '. venv/bin/activate && flake8 src/'
                sh '. venv/bin/activate && black --check src/'
            }
        }
        
        stage('Test') {
            steps {
                sh '. venv/bin/activate && pip install pytest pytest-cov'
                sh '. venv/bin/activate && pytest --cov=src --cov-report=html tests/'
            }
            post {
                always {
                    publishHTML([
                        allowMissing: false,
                        alwaysLinkToLastBuild: true,
                        keepAll: true,
                        reportDir: 'htmlcov',
                        reportFiles: 'index.html',
                        reportName: 'Coverage Report'
                    ])
                }
            }
        }
        
        stage('Security Scan') {
            steps {
                sh '. venv/bin/activate && pip install bandit'
                sh '. venv/bin/activate && bandit -r src/'
            }
        }
        
        stage('Build Docker Image') {
            when {
                branch 'main'
            }
            steps {
                script {
                    def image = docker.build("${env.DOCKER_REGISTRY}/${env.IMAGE_NAME}:${env.BUILD_NUMBER}")
                    docker.withRegistry("https://${env.DOCKER_REGISTRY}", 'docker-registry-credentials') {
                        image.push()
                        image.push('latest')
                    }
                }
            }
        }
    }
    
    post {
        always {
            cleanWs()
        }
        success {
            echo 'Pipeline succeeded!'
        }
        failure {
            echo 'Pipeline failed!'
            emailext (
                subject: "Failed Pipeline: ${env.JOB_NAME} - ${env.BUILD_NUMBER}",
                body: "Check console output at ${env.BUILD_URL}",
                to: "${env.CHANGE_AUTHOR_EMAIL}"
            )
        }
    }
}

CD Pipeline Setup

1. Deployment Strategies

Blue-Green Deployment

# .github/workflows/deploy-blue-green.yml
name: Blue-Green Deployment

on:
  workflow_run:
    workflows: ["CI Pipeline"]
    types:
      - completed
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Setup kubectl
      uses: azure/setup-kubectl@v1
      with:
        version: 'v1.24.0'
    
    - name: Configure AWS Credentials
      uses: aws-actions/configure-aws-credentials@v1
      with:
        aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
        aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        aws-region: us-east-1
    
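    - name: Deploy to Blue Environment
      run: |
        # Assumes kubectl is already authenticated against the target cluster
        # Update the blue environment with the image built from the tested commit
        kubectl set image deployment/app-blue app=${{ secrets.DOCKER_REGISTRY }}/app:${{ github.event.workflow_run.head_sha }} --namespace=production
        
        # Wait for the blue environment to become ready
        kubectl rollout status deployment/app-blue --namespace=production
        
        # Run a health check from inside the cluster, in the same namespace as the service
        kubectl run health-check --image=curlimages/curl --rm -i --restart=Never --namespace=production -- \
          curl -f http://app-blue-service/health
    
    - name: Switch Traffic to Blue
      run: |
        # Point the production service at the blue environment
        kubectl patch service app-service -p '{"spec":{"selector":{"version":"blue"}}}' --namespace=production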
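
The workflow above always deploys to the blue environment. In a full blue-green setup the two environments usually alternate: the new version goes to whichever colour is idle, and traffic is switched only after it passes its health check. A minimal sketch of that selection logic follows; the function names `idle_color` and `selector_patch` are hypothetical, and the `version` selector label is taken from the patch command in the workflow.

```python
import json

COLORS = ("blue", "green")


def idle_color(active: str) -> str:
    """Return the environment that is NOT currently serving traffic."""
    if active not in COLORS:
        raise ValueError(f"unknown environment: {active!r}")
    return "green" if active == "blue" else "blue"


def selector_patch(color: str) -> str:
    """Build the JSON body that points the service selector at `color`."""
    if color not in COLORS:
        raise ValueError(f"unknown environment: {color!r}")
    return json.dumps({"spec": {"selector": {"version": color}}}, separators=(",", ":"))


if __name__ == "__main__":
    target = idle_color("blue")    # deploy the new version here first
    print(target)
    print(selector_patch(target))  # body for `kubectl patch service ... -p`
```

Keeping the previous colour running after the switch is what makes rollback cheap: switching back is the same patch with the other colour.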

Rolling Update

# .github/workflows/rolling-update.yml
name: Rolling Update Deployment

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Setup Helm
      uses: azure/setup-helm@v1
      with:
        version: 'v3.8.0'
    
    - name: Deploy with Helm
      run: |
        helm upgrade --install app ./helm-chart \
          --set image.tag=${{ github.sha }} \
          --set image.repository=${{ secrets.DOCKER_REGISTRY }}/app \
          --namespace=production \
          --wait \
          --timeout=10m

Canary Release

# .github/workflows/canary-deployment.yml
name: Canary Deployment

on:
  push:
    branches: [main]

jobs:
  deploy-canary:
    runs-on: ubuntu-latest
    
    steps:
    - uses: actions/checkout@v3
    
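    - name: Deploy Canary Version
      run: |
        # Deploy the canary version (10% of traffic)
        kubectl apply -f k8s/canary-deployment.yaml
        
        # Configure traffic splitting
        kubectl apply -f k8s/canary-service.yaml
    
    - name: Monitor Canary
      run: |
        # Poll the canary's health. "canary" is a custom resource, so a canary
        # controller must be installed; phase names depend on that controller.
        HEALTHY=false
        for i in {1..10}; do
          STATUS=$(kubectl get canary app-canary -o jsonpath='{.status.phase}')
          if [ "$STATUS" = "Progressing" ]; then
            echo "Canary deployment is progressing..."
            sleep 30
          elif [ "$STATUS" = "Healthy" ]; then
            echo "Canary is healthy, promoting to full rollout"
            HEALTHY=true
            break
          else
            echo "Canary failed, rolling back"
            kubectl delete -f k8s/canary-deployment.yaml
            exit 1
          fi
        done
        
        # Do not promote a canary that never became healthy within the polling window
        if [ "$HEALTHY" != "true" ]; then
          echo "Canary did not become healthy in time, rolling back"
          kubectl delete -f k8s/canary-deployment.yaml
          exit 1
        fi
    
    - name: Promote to Full Rollout
      run: |
        # Promote the new image to the main deployment, then remove the canary
        kubectl set image deployment/app app=${{ secrets.DOCKER_REGISTRY }}/app:${{ github.sha }}
        kubectl delete -f k8s/canary-deployment.yaml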
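
The monitoring step is a bounded polling loop with three outcomes: healthy, failed, or still progressing when the attempts run out. The last case is easy to overlook, and it should count as a failure rather than fall through to promotion. The sketch below isolates that decision logic so it can be tested without a cluster; `wait_for_canary` and its parameters are hypothetical names, and the phase strings are the ones used in the workflow.

```python
import time
from typing import Callable


def wait_for_canary(
    get_phase: Callable[[], str],
    attempts: int = 10,
    delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `get_phase` until the canary is healthy.

    Returns True only if the phase becomes "Healthy" within `attempts` polls.
    Any phase other than "Progressing" or "Healthy" is treated as a failure,
    and so is running out of attempts while still progressing.
    """
    for _ in range(attempts):
        phase = get_phase()
        if phase == "Healthy":
            return True
        if phase != "Progressing":
            return False
        sleep(delay)
    return False


if __name__ == "__main__":
    phases = iter(["Progressing", "Progressing", "Healthy"])
    # Inject a no-op sleep so the demonstration finishes immediately.
    print(wait_for_canary(lambda: next(phases), sleep=lambda seconds: None))
```

Passing the status source and the sleep function as parameters is a deliberate choice: the same loop can be driven by `kubectl` in production and by a fixed sequence in a unit test.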

2. Infrastructure as Code (IaC)

Terraform Configuration

# terraform/main.tf
provider "aws" {
  region = var.aws_region
}

# VPC configuration
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = "main-vpc"
  }
}

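# EKS cluster (aws_iam_role.eks_cluster, its policy attachment, and the aws_subnet
# resources referenced here are assumed to be defined elsewhere in the configuration)
resource "aws_eks_cluster" "main" {
  name     = var.cluster_name
  role_arn = aws_iam_role.eks_cluster.arn
  version  = "1.24"

  vpc_config {
    subnet_ids = concat(
      aws_subnet.public[*].id,
      aws_subnet.private[*].id
    )
  }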

  depends_on = [
    aws_iam_role_policy_attachment.eks_cluster_policy,
  ]
}

# Application deployment
resource "kubernetes_deployment" "app" {
  metadata {
    name = "app"
    labels = {
      app = "my-app"
    }
  }

  spec {
    replicas = 3

    selector {
      match_labels = {
        app = "my-app"
      }
    }

    template {
      metadata {
        labels = {
          app = "my-app"
        }
      }

      spec {
        container {
          image = "${var.docker_registry}/app:${var.image_tag}"
          name  = "app"
          
          port {
            container_port = 8080
          }
          
          resources {
            limits = {
              cpu    = "0.5"
              memory = "512Mi"
            }
            requests = {
              cpu    = "250m"
              memory = "50Mi"
            }
          }
        }
      }
    }
  }

  depends_on = [
    aws_eks_cluster.main,
  ]
}

# Expose the application through a service
resource "kubernetes_service" "app" {
  metadata {
    name = "app-service"
  }

  spec {
    selector = {
      app = kubernetes_deployment.app.metadata[0].labels.app
    }

    port {
      protocol    = "TCP"
      port        = 80
      target_port = 8080
    }

    type = "LoadBalancer"
  }

  depends_on = [
    kubernetes_deployment.app,
  ]
}

Terraform CI/CD Integration

# .github/workflows/terraform.yml
name: Terraform CI/CD

on:
  push:
    paths:
      - 'terraform/**'
    branches: [main]
  pull_request:
    paths:
      - 'terraform/**'
    branches: [main]

jobs:
  terraform:
    runs-on: ubuntu-latest
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Setup Terraform
      uses: hashicorp/setup-terraform@v1
      with:
        terraform_version: 1.3.0
    
    - name: Configure AWS Credentials
      uses: aws-actions/configure-aws-credentials@v1
      with:
        aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
        aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        aws-region: us-east-1
    
    - name: Terraform Format Check
      run: terraform fmt -check
      working-directory: ./terraform
    
    - name: Terraform Init
      run: terraform init
      working-directory: ./terraform
    
    - name: Terraform Validate
      run: terraform validate
      working-directory: ./terraform
    
    - name: Terraform Plan
      if: github.event_name == 'pull_request'
      run: terraform plan -out=plan.tfplan
      working-directory: ./terraform
    
    - name: Terraform Apply
      if: github.ref == 'refs/heads/main' && github.event_name == 'push'
      run: terraform apply -auto-approve
      working-directory: ./terraform

Advanced Configuration and Best Practices

1. Multi-Environment Management

Environment Configuration Layout

configs/
├── environments/
│   ├── development/
│   │   ├── variables.yml
│   │   └── config.yml
│   ├── staging/
│   │   ├── variables.yml
│   │   └── config.yml
│   └── production/
│       ├── variables.yml
│       └── config.yml
└── templates/
    ├── docker-compose.yml.j2
    └── k8s-deployment.yml.j2

Multi-Environment Deployment Configuration

# .github/workflows/multi-env-deploy.yml
name: Multi-Environment Deployment

on:
  workflow_dispatch:
    inputs:
      environment:
        description: 'Target environment'
        required: true
        default: 'staging'
        type: choice
        options:
          - development
          - staging
          - production

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
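  build:
    runs-on: ubuntu-latest
    # GITHUB_TOKEN needs write access to packages in order to push to ghcr.io
    permissions:
      contents: read
      packages: write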
    steps:
    - uses: actions/checkout@v3
    
    - name: Log in to Container Registry
      uses: docker/login-action@v2
      with:
        registry: ${{ env.REGISTRY }}
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}
    
    - name: Extract metadata
      id: meta
      uses: docker/metadata-action@v4
      with:
        images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
    
    - name: Build and push Docker image
      uses: docker/build-push-action@v4
      with:
        context: .
        push: true
        tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
        labels: ${{ steps.meta.outputs.labels }}
  
  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment: ${{ github.event.inputs.environment }}
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Deploy to ${{ github.event.inputs.environment }}
      run: |
        # Use environment-specific configuration
        echo "Deploying to $TARGET_ENV"
        
        # Choose the namespace and sizing for the target environment
        if [ "$TARGET_ENV" = "production" ]; then
          NAMESPACE="production"
          REPLICA_COUNT="5"
          CPU_LIMIT="1000m"
          MEMORY_LIMIT="2Gi"
        elif [ "$TARGET_ENV" = "staging" ]; then
          NAMESPACE="staging"
          REPLICA_COUNT="3"
          CPU_LIMIT="500m"
          MEMORY_LIMIT="1Gi"
        else
          NAMESPACE="development"
          REPLICA_COUNT="1"
          CPU_LIMIT="200m"
          MEMORY_LIMIT="512Mi"
        fi
        
        # Export the values so that envsubst can substitute them into the template
        export NAMESPACE REPLICA_COUNT CPU_LIMIT MEMORY_LIMIT
        
        # Apply the rendered manifest
        envsubst < k8s/deployment.yml.template | kubectl apply -f -
        
        # Wait for the rollout to finish
        kubectl rollout status deployment/app --namespace=$NAMESPACE
      env:
        IMAGE_TAG: ${{ github.sha }}
        TARGET_ENV: ${{ github.event.inputs.environment }}
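
The deploy step combines two ideas: a lookup table of per-environment settings and a template whose `${NAME}` placeholders are filled in by `envsubst`. The sketch below shows the same pattern in Python with `string.Template`, which uses the same placeholder syntax. The settings mirror the values in the workflow; the function name `render_manifest` and the two-line template are illustrative assumptions, not the contents of `k8s/deployment.yml.template`.

```python
from string import Template

# Per-environment settings, mirroring the values used in the workflow.
ENVIRONMENTS = {
    "production": {"NAMESPACE": "production", "REPLICA_COUNT": "5",
                   "CPU_LIMIT": "1000m", "MEMORY_LIMIT": "2Gi"},
    "staging": {"NAMESPACE": "staging", "REPLICA_COUNT": "3",
                "CPU_LIMIT": "500m", "MEMORY_LIMIT": "1Gi"},
    "development": {"NAMESPACE": "development", "REPLICA_COUNT": "1",
                    "CPU_LIMIT": "200m", "MEMORY_LIMIT": "512Mi"},
}


def render_manifest(template_text: str, environment: str, image_tag: str) -> str:
    """Fill the placeholders in `template_text` for the given environment.

    Unknown environments fall back to the development settings, as the
    workflow's final `else` branch does.
    """
    values = dict(ENVIRONMENTS.get(environment, ENVIRONMENTS["development"]))
    values["IMAGE_TAG"] = image_tag
    # substitute() raises KeyError for a placeholder with no value, whereas
    # envsubst would silently insert an empty string.
    return Template(template_text).substitute(values)


if __name__ == "__main__":
    demo = "namespace: ${NAMESPACE}\nreplicas: ${REPLICA_COUNT}\n"
    print(render_manifest(demo, "staging", "abc123"))
```

Failing on a missing placeholder is usually the safer behaviour for deployment manifests, because an empty replica count or namespace tends to produce confusing errors later.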

2. Secret Management

Using Kubernetes Secrets

# k8s/secrets.yml
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
data:
  database-url: <base64-encoded-database-url>
  api-key: <base64-encoded-api-key>
  jwt-secret: <base64-encoded-jwt-secret>

External Secret Management

# .github/workflows/secrets-management.yml
name: Secrets Management

on: push

jobs:
  manage-secrets:
    runs-on: ubuntu-latest
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Setup Vault
      uses: hashicorp/vault-action@v2.4.0
      with:
        url: ${{ secrets.VAULT_URL }}
        token: ${{ secrets.VAULT_TOKEN }}
        secrets: |
          secret/data/app/database DATABASE_URL;
          secret/data/app/api API_KEY;
          secret/data/app/jwt JWT_SECRET;
    
    - name: Apply Kubernetes Secrets
      run: |
        # Let kubectl do the base64 encoding. Piping "echo" into base64 would append a
        # trailing newline to every value and could wrap long values across lines.
        kubectl create secret generic app-secrets \
          --from-literal=database-url="$DATABASE_URL" \
          --from-literal=api-key="$API_KEY" \
          --from-literal=jwt-secret="$JWT_SECRET" \
          --dry-run=client -o yaml | kubectl apply -f -
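
Values under `data:` in a Kubernetes Secret are base64-encoded, and the encoding covers every byte, including a trailing newline that a shell `echo` adds. The short sketch below shows how that one extra byte changes the encoded value; the function name `encode_secret` and the sample value are illustrative.

```python
import base64


def encode_secret(value: str) -> str:
    """Base64-encode a secret value exactly as given, with no added newline."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


# What the application expects to read back.
clean = encode_secret("secret")

# What `echo secret | base64` produces: the newline becomes part of the value.
with_newline = encode_secret("secret\n")

if __name__ == "__main__":
    print(clean)
    print(with_newline)
```

An application that compares the decoded value byte for byte would reject the second form, which is a common cause of credentials that work locally and fail in the cluster.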

3. Testing Strategy

Multi-Stage Testing

# .github/workflows/testing.yml
name: Testing Strategy

on: [push, pull_request]

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Set up Python
      uses: actions/setup-python@v4
      with:
        python-version: '3.10'
    
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -r requirements-test.txt
    
    - name: Run unit tests
      run: |
        pytest tests/unit/ -v --cov=src --cov-report=xml --cov-report=html
    
    - name: Upload coverage reports
      uses: codecov/codecov-action@v3
      with:
        file: ./coverage.xml

  integration-tests:
    runs-on: ubuntu-latest
    needs: unit-tests
    
    services:
      postgres:
        image: postgres:13
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: test_db
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432
      
      redis:
        image: redis:6
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 6379:6379
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Set up Python
      uses: actions/setup-python@v4
      with:
        python-version: '3.10'
    
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -r requirements-test.txt
    
    - name: Wait for services
      run: |
        sleep 10
    
    - name: Run integration tests
      env:
        DATABASE_URL: postgresql://postgres:postgres@localhost:5432/test_db
        REDIS_URL: redis://localhost:6379/0
      run: |
        pytest tests/integration/ -v

  e2e-tests:
    runs-on: ubuntu-latest
    needs: integration-tests
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Set up Node.js
      uses: actions/setup-node@v3
      with:
        node-version: '16'
        cache: 'npm'
        cache-dependency-path: e2e/package-lock.json
    
    - name: Install dependencies
      run: |
        cd e2e
        npm ci
    
    - name: Start application
      run: |
        docker compose up -d
        sleep 30
    
    - name: Run E2E tests
      run: |
        cd e2e
        npm run test:headless
    
    - name: Stop application
      if: always()
      run: |
        docker compose down

Monitoring and Optimization

1. CI/CD Pipeline Monitoring

Pipeline Performance Monitoring

# .github/workflows/monitoring.yml
name: Pipeline Monitoring

on:
  workflow_dispatch:
  schedule:
    - cron: '0 6 * * 1'  # Runs every Monday at 06:00 UTC

jobs:
  analyze-performance:
    runs-on: ubuntu-latest
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Fetch workflow runs
      uses: actions/github-script@v6
      id: fetch-runs
      with:
        script: |
          const runs = await github.rest.actions.listWorkflowRuns({
            owner: context.repo.owner,
            repo: context.repo.repo,
            workflow_id: 'ci.yml',
            per_page: 30,
          });
          
          // Return only the fields the analysis needs, to keep the step output small
          return runs.data.workflow_runs.map(run => ({
            created_at: run.created_at,
            updated_at: run.updated_at,
            conclusion: run.conclusion,
          }));
    
    - name: Analyze performance data
      env:
        RUNS_JSON: ${{ steps.fetch-runs.outputs.result }}
      run: |
        # Compute the average run duration and the success rate, then expose both to later steps
        node -e "
          const runs = JSON.parse(process.env.RUNS_JSON);
          const total = runs.reduce((sum, run) => sum + (new Date(run.updated_at) - new Date(run.created_at)), 0);
          const avg = runs.length ? Math.round(total / runs.length / 1000) : 0;
          const ok = runs.filter(run => run.conclusion === 'success').length;
          const rate = runs.length ? Math.round(100 * ok / runs.length) : 0;
          console.log('AVG_DURATION=' + avg);
          console.log('SUCCESS_RATE=' + rate);
        " >> "$GITHUB_ENV"
    
    - name: Generate performance report
      run: |
        # Generate the performance report
        cat > performance-report.md << EOF
        # CI/CD Pipeline Performance Report
        
        Generated on: $(date)
        
        ## Metrics
        
        - Average pipeline duration: ${AVG_DURATION} seconds
        - Success rate: ${SUCCESS_RATE}%
        
        ## Recommendations
        
        1. Consider caching dependencies to reduce build time
        2. Optimize test suite execution
        3. Review long-running steps for optimization
        EOF
        
        # Publish the report as a pre-release asset
        gh release create performance-report-$(date +%Y%m%d) ./performance-report.md --title "Performance Report $(date +%Y-%m-%d)" --prerelease
      env:
        GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
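
The two metrics in the report can be derived from three fields of each workflow run: `created_at`, `updated_at`, and `conclusion`. The timestamps are ISO 8601 strings, so they have to be parsed before they can be subtracted. The sketch below computes both metrics from a list of run records; the function name `summarize_runs` and the sample records are illustrative, and the timestamp format assumes the UTC form with a trailing `Z`.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def summarize_runs(runs: list) -> dict:
    """Return the average duration in seconds and the success rate in percent."""
    if not runs:
        return {"avg_seconds": 0, "success_rate": 0}

    durations = [
        (datetime.strptime(run["updated_at"], TIME_FORMAT)
         - datetime.strptime(run["created_at"], TIME_FORMAT)).total_seconds()
        for run in runs
    ]
    successes = sum(1 for run in runs if run["conclusion"] == "success")
    return {
        "avg_seconds": round(sum(durations) / len(durations)),
        "success_rate": round(100 * successes / len(runs)),
    }


# Hypothetical sample: one 5-minute success and one 3-minute failure.
SAMPLE_RUNS = [
    {"created_at": "2024-01-01T10:00:00Z", "updated_at": "2024-01-01T10:05:00Z",
     "conclusion": "success"},
    {"created_at": "2024-01-01T11:00:00Z", "updated_at": "2024-01-01T11:03:00Z",
     "conclusion": "failure"},
]

if __name__ == "__main__":
    print(summarize_runs(SAMPLE_RUNS))
```

Note that `updated_at` minus `created_at` includes time spent waiting in the queue, so it measures end-to-end latency rather than pure execution time.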

2. Application Monitoring Integration

Post-Deployment Health Checks

# .github/workflows/health-check.yml
name: Post-Deployment Health Check

on:
  workflow_run:
    workflows: ["Deploy to Production"]
    types:
      - completed
    branches: [main]

jobs:
  health-check:
    runs-on: ubuntu-latest
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Health Check
      run: |
        # Basic health check
        STATUS_CODE=$(curl -s -o /dev/null -w "%{http_code}" https://api.example.com/health)
        if [ "$STATUS_CODE" -ne 200 ]; then
          echo "Health check failed with status code: $STATUS_CODE"
          exit 1
        fi
        
        # Deep health check
        DEEP_CHECK=$(curl -s https://api.example.com/health/deep)
        echo "Deep health check result: $DEEP_CHECK"
        
        # Check critical dependencies
        DB_STATUS=$(echo "$DEEP_CHECK" | jq -r '.database')
        if [ "$DB_STATUS" != "healthy" ]; then
          echo "Database is not healthy"
          exit 1
        fi
    
    - name: Run Smoke Tests
      run: |
        # Run smoke tests
        docker run --rm -v $(pwd):/app -w /app node:16-alpine sh -c "
          npm install -g newman
          newman run tests/smoke/postman_collection.json --environment tests/smoke/production.env
        "
    
    - name: Update Monitoring Dashboards
      run: |
        # Annotate the monitoring dashboards with the deployment
        curl -X POST https://monitoring.example.com/api/v1/annotations \
          -H "Authorization: Bearer ${{ secrets.MONITORING_API_KEY }}" \
          -H "Content-Type: application/json" \
          -d '{
            "text": "Deployment completed successfully",
            "tags": ["deployment", "success"],
            "time": '$(date +%s)'
          }'
    
    - name: Notify Team
      if: success()
      run: |
        # Send a notification
        curl -X POST -H 'Content-type: application/json' \
          --data '{"text":"✅ Application deployed successfully to production. Health checks passed."}' \
          ${{ secrets.SLACK_WEBHOOK }}

3. Performance Optimization

Caching Strategy

# .github/workflows/caching.yml
name: Optimized CI with Caching

on: [push, pull_request]

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Cache Python dependencies
      uses: actions/cache@v3
      with:
        path: ~/.cache/pip
        key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements*.txt') }}
        restore-keys: |
          ${{ runner.os }}-pip-
    
    - name: Cache Docker layers
      uses: actions/cache@v3
      with:
        path: /tmp/.buildx-cache
        key: ${{ runner.os }}-buildx-${{ github.sha }}
        restore-keys: |
          ${{ runner.os }}-buildx-
    
    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v2
    
    - name: Build Docker image with cache
      uses: docker/build-push-action@v4
      with:
        context: .
        push: false
        load: true
        tags: app:test
        cache-from: type=local,src=/tmp/.buildx-cache
        cache-to: type=local,dest=/tmp/.buildx-cache-new,mode=max
    
    - name: Move cache
      run: |
        rm -rf /tmp/.buildx-cache
        mv /tmp/.buildx-cache-new /tmp/.buildx-cache
    
    - name: Run tests
      run: |
        docker run --rm -v $(pwd):/app app:test python -m pytest
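
The pip cache key in this workflow is built from the runner OS and a hash of the requirements files, so the cache is reused until a dependency file changes. The sketch below illustrates that idea in Python. It is written in the spirit of `hashFiles` and does not reproduce GitHub's exact algorithm; the function name `cache_key` and the sample file contents are assumptions.

```python
import hashlib


def cache_key(runner_os: str, files: dict) -> str:
    """Derive a cache key from file names and contents.

    `files` maps a file name to its bytes. Names are processed in sorted
    order so that the key does not depend on dictionary ordering.
    """
    digest = hashlib.sha256()
    for name in sorted(files):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")  # separator between name and content
        digest.update(files[name])
    return f"{runner_os}-pip-{digest.hexdigest()}"


if __name__ == "__main__":
    before = cache_key("Linux", {"requirements.txt": b"pytest==7.4.0\n"})
    after = cache_key("Linux", {"requirements.txt": b"pytest==8.0.0\n"})
    print(before == after)  # a changed dependency yields a different key
```

The `restore-keys` prefix (`Linux-pip-` here) is what allows a run with a new key to start from the most recent older cache instead of an empty one.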

Parallelization Strategy

# .github/workflows/parallelization.yml
name: Parallelized CI Pipeline

on: [push, pull_request]

jobs:
  split-tests:
    runs-on: ubuntu-latest
    outputs:
      test-files: ${{ steps.split.outputs.test-files }}
      total-parts: ${{ steps.split.outputs.total-parts }}
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Split tests
      id: split
      run: |
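        # Collect all test files (only files named test_*.py, so helpers such as conftest.py are excluded)
        TEST_FILES=$(find tests/ -name "test_*.py" -type f | sort | tr '\n' ' ')
        echo "test-files=$TEST_FILES" >> $GITHUB_OUTPUT
        
        # Choose the number of parts based on the file count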
        FILE_COUNT=$(echo $TEST_FILES | wc -w)
        if [ $FILE_COUNT -le 10 ]; then
          PARTS=2
        elif [ $FILE_COUNT -le 20 ]; then
          PARTS=4
        else
          PARTS=6
        fi
        echo "total-parts=$PARTS" >> $GITHUB_OUTPUT

  test-matrix:
    runs-on: ubuntu-latest
    needs: split-tests
    strategy:
      matrix:
        part: [1, 2, 3, 4, 5, 6]
      fail-fast: false
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Set up Python
      uses: actions/setup-python@v4
      with:
        python-version: '3.10'
    
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -r requirements.txt
        pip install pytest pytest-xdist
    
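    - name: Run tests for part ${{ matrix.part }}
      env:
        TEST_FILES: ${{ needs.split-tests.outputs.test-files }}
        TOTAL_PARTS: ${{ needs.split-tests.outputs.total-parts }}
        PART: ${{ matrix.part }}
      run: |
        # Assign files to parts round-robin and keep only the files for this part
        SELECTED=$(echo "$TEST_FILES" | tr ' ' '\n' | awk -v n="$TOTAL_PARTS" -v p="$PART" 'NF && (NR - 1) % n == p - 1')
        if [ "$PART" -gt "$TOTAL_PARTS" ] || [ -z "$SELECTED" ]; then
          echo "No tests assigned to part $PART"
          exit 0
        fi
        
        # Run this part's files, parallelized across CPU cores by pytest-xdist
        pytest -n auto --dist=loadfile --maxfail=5 \
          --junitxml=results-$PART.xml $SELECTED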
    
    - name: Upload test results
      uses: actions/upload-artifact@v4
      with:
        name: test-results-${{ matrix.part }}
        path: results-${{ matrix.part }}.xml

  merge-results:
    runs-on: ubuntu-latest
    needs: test-matrix
    if: always()
    
    steps:
    # The pattern and merge-multiple inputs require v4 of the artifact actions
    - uses: actions/download-artifact@v4
      with:
        pattern: test-results-*
        merge-multiple: true
    
    - name: Merge test reports
      run: |
        # Merge the per-part JUnit reports into a single file
        pip install junitparser
        python -c "
        import glob
        from junitparser import JUnitXml
        
        files = sorted(glob.glob('results-*.xml'))
        result = JUnitXml()
        for path in files:
            result += JUnitXml.fromfile(path)
        
        result.write('merged-results.xml')
        print(f'Merged test results from {len(files)} files')
        "
    
    - name: Publish test results
      uses: actions/upload-artifact@v4
      with:
        name: merged-test-results
        path: merged-results.xml
    
    - name: Report test status
      if: needs.test-matrix.result == 'failure'
      run: |
        # Send a notification if any test part failed
        curl -X POST -H 'Content-type: application/json' \
          --data '{"text":"🚨 Tests failed in CI pipeline. Check the results for details."}' \
          ${{ secrets.SLACK_WEBHOOK }}
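
Splitting a test suite across matrix jobs comes down to a deterministic assignment of files to parts, so that every file runs exactly once and each job can compute its own share independently. The sketch below shows a round-robin assignment over the sorted file list; the function name `shard` and the sample file names are illustrative.

```python
def shard(files: list, total_parts: int, part: int) -> list:
    """Return the files assigned to `part` (1-based) out of `total_parts`.

    Files are sorted first so that every job computes the same assignment,
    then dealt out round-robin.
    """
    if total_parts < 1 or not 1 <= part <= total_parts:
        raise ValueError("part must be between 1 and total_parts")
    return [name for index, name in enumerate(sorted(files))
            if index % total_parts == part - 1]


if __name__ == "__main__":
    tests = ["tests/test_a.py", "tests/test_b.py", "tests/test_c.py",
             "tests/test_d.py", "tests/test_e.py"]
    for part in (1, 2):
        print(part, shard(tests, 2, part))
```

Round-robin balances the number of files per part, not their run time. When a few files dominate the suite, splitting by recorded duration gives more even jobs.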

Common Problems and Solutions

1. Resource Limits

Solution: configure resource limits and free up resources

# .github/workflows/resource-optimization.yml
name: Resource Optimized CI

on: [push, pull_request]

jobs:
  optimized-build:
    runs-on: ubuntu-latest
    
    # With self-hosted runners, select a larger runner instead (a job may declare only one runs-on key):
    # runs-on: [self-hosted, large]
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Limit memory usage
      run: |
        # Cap the memory and CPU available to the container
        docker run --memory=2g --cpus=2.0 my-image
    
    - name: Free disk space
      run: |
        # Remove unneeded preinstalled files to free disk space
        sudo rm -rf /usr/share/dotnet/
        sudo rm -rf /opt/ghc/
        sudo rm -rf "/usr/local/share/boost"
        sudo apt-get clean
        docker system prune -af

2. Concurrent Execution

Solution: environment isolation and resource locking

# .github/workflows/concurrency.yml
name: Concurrent Execution Management

on: [push]

jobs:
  deploy:
    runs-on: ubuntu-latest
    
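    # Allow only one deployment per ref at a time; queue new runs instead of
    # cancelling a deployment that is already in progress
    concurrency:
      group: deploy-${{ github.ref }}
      cancel-in-progress: false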
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Acquire deployment lock
      uses: softprops/turnstyle@v1
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      with:
        same-branch-only: true
    
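    - name: Deploy application
      run: |
        # Deployment steps
        echo "Deploying..."
        
    - name: Release deployment lock
      if: always()
      run: |
        # No explicit unlock is needed: the concurrency group and turnstyle both release when the run ends
        echo "Deployment completed or failed, releasing lock"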

3. Flaky Tests

Solution: retries and test isolation

# .github/workflows/flaky-tests.yml
name: Handling Flaky Tests

on: [push, pull_request]

jobs:
  test-with-retry:
    runs-on: ubuntu-latest
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Set up Python
      uses: actions/setup-python@v4
      with:
        python-version: '3.10'
    
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -r requirements.txt
        pip install pytest pytest-rerunfailures pytest-xdist
    
    - name: Run tests with retry
      run: |
        # Retry failed tests up to 3 times and write a JUnit report for later analysis
        pytest --reruns=3 --reruns-delay=5 --maxfail=10 -n auto --junitxml=test-results.xml
    
    - name: Run tests in isolation
      run: |
        # Isolate the test environment in containers
        docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from test
    
    - name: Identify unstable tests
      if: always()
      run: |
        # List the tests that still failed after all retries
        python -c "
        import sys
        import xml.etree.ElementTree as ET
        
        # Parse the JUnit report
        root = ET.parse('test-results.xml').getroot()
        
        # Collect test cases that contain a failure or error element
        failed_tests = [
            testcase.get('name')
            for testcase in root.iter('testcase')
            if testcase.find('failure') is not None or testcase.find('error') is not None
        ]
        
        if failed_tests:
            print('Tests still failing after retries:')
            for test in failed_tests:
                print(f'  - {test}')
            sys.exit(1)
        "

4. Secret Management Problems

Solution: secure secret management and rotation

# .github/workflows/secret-management.yml
name: Advanced Secret Management

on:
  schedule:
    - cron: '0 2 1 * *'  # Runs at 02:00 UTC on the first day of every month
  workflow_dispatch:

jobs:
  rotate-secrets:
    runs-on: ubuntu-latest
    
    steps:
    - uses: actions/checkout@v3
    
    - name: Generate new secrets
      id: generate-secrets
      run: |
        # Generate a new API key and JWT secret
        NEW_API_KEY=$(openssl rand -hex 32)
        NEW_JWT_SECRET=$(openssl rand -base64 48)
        
        # Mask the values so they never appear in the logs
        echo "::add-mask::$NEW_API_KEY"
        echo "::add-mask::$NEW_JWT_SECRET"
        
        # Expose the values to later steps as step outputs
        echo "api-key=$NEW_API_KEY" >> $GITHUB_OUTPUT
        echo "jwt-secret=$NEW_JWT_SECRET" >> $GITHUB_OUTPUT
    
    - name: Update application secrets
      env:
        API_KEY: ${{ steps.generate-secrets.outputs.api-key }}
        JWT_SECRET: ${{ steps.generate-secrets.outputs.jwt-secret }}
      run: |
        # Never commit secrets to the repository. Update the Kubernetes Secret in place instead
        # (assumes kubectl is already authenticated against the target cluster).
        kubectl patch secret app-secrets --type merge \
          -p "{\"stringData\":{\"api-key\":\"$API_KEY\",\"jwt-secret\":\"$JWT_SECRET\"}}"
    
    - name: Update external services
      run: |
        # Rotate the secrets in external services
        curl -X POST https://api.example.com/rotate-secrets \
          -H "Authorization: Bearer ${{ secrets.ADMIN_API_KEY }}" \
          -H "Content-Type: application/json" \
          -d '{
            "new_api_key": "${{ steps.generate-secrets.outputs.api-key }}",
            "new_jwt_secret": "${{ steps.generate-secrets.outputs.jwt-secret }}"
          }'
    
    - name: Update GitHub Secrets
      run: |
        # Update the repository secrets. The default GITHUB_TOKEN cannot write secrets, so this
        # step needs a personal access token (SECRETS_ADMIN_TOKEN is a placeholder name).
        gh secret set API_KEY --body "${{ steps.generate-secrets.outputs.api-key }}"
        gh secret set JWT_SECRET --body "${{ steps.generate-secrets.outputs.jwt-secret }}"
      env:
        GH_TOKEN: ${{ secrets.SECRETS_ADMIN_TOKEN }}
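
Rotation depends on generating values from a cryptographically secure random source. The workflow uses `openssl rand` for this; the sketch below shows a rough Python equivalent built on the standard `secrets` module, producing values of the same length and encoding. The function names are hypothetical.

```python
import base64
import secrets


def new_api_key() -> str:
    """Return 32 random bytes as 64 hex characters (like `openssl rand -hex 32`)."""
    return secrets.token_hex(32)


def new_jwt_secret() -> str:
    """Return 48 random bytes, base64-encoded (like `openssl rand -base64 48`)."""
    return base64.b64encode(secrets.token_bytes(48)).decode("ascii")


if __name__ == "__main__":
    # Print only the lengths: generated secrets should never be written to logs.
    print(len(new_api_key()), len(new_jwt_secret()))
```

The `random` module is not a substitute here, because its generator is predictable and intended for simulations rather than for credentials.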

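Summary

CI/CD is an indispensable part of modern software development: by automating the build, test, and deployment process, it greatly improves the speed and quality of software delivery. This guide has moved from basic concepts to advanced practices and covered the main aspects of CI/CD configuration:

Choosing a suitable CI/CD tool
Designing an efficient pipeline
Managing multiple environments and deployment strategies
Handling secrets and security
Monitoring and optimizing pipeline performance
Solving common problems

By following these best practices, a team can build a stable, reliable, and efficient CI/CD process that speeds up delivery while maintaining high quality standards.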

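Continuous Improvement

CI/CD is not a one-time setup but a process of continuous improvement. Teams should regularly:

Analyze pipeline performance metrics
Collect feedback from the team
Evaluate new tools and techniques
Optimize bottleneck steps
Update security practices

With continuous improvement, the pipeline keeps adapting to the changing needs of the project and the team, and remains a dependable foundation for development.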
