
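Table of Contents
- [CI/CD Overview](#cicd-overview)
- [Choosing a CI/CD Tool](#choosing-a-cicd-tool)
- Basic Environment Setup
- [CI Pipeline Setup](#ci-pipeline-setup)
- [CD Pipeline Setup](#cd-pipeline-setup)
- Advanced Configuration and Best Practices
- Monitoring and Optimization
- Common Problems and Solutions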
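CI/CD Overview
What is CI/CD?
CI/CD stands for continuous integration, continuous delivery, and continuous deployment: a set of practices that use automation to ship application changes to users frequently and reliably.
- Continuous integration (CI): developers merge code into a shared repository frequently, and every merge automatically triggers a build and a test run.
- Continuous delivery (CD): code that passes the tests is automatically deployed to a pre-production (staging) environment, so it is always in a releasable state; the release to production is typically still triggered by a manual approval.
- Continuous deployment: code that has been verified in pre-production is deployed to production automatically, with no manual step.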
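The difference between the last two practices is easiest to see as a decision rule. The following is a minimal sketch, not a real pipeline: the `Change` record, its fields, and the stage names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """A code change moving through a simplified pipeline (hypothetical model)."""
    tests_passed: bool
    staging_verified: bool
    release_approved: bool = False  # manual sign-off, only relevant for continuous delivery


def furthest_stage(change: Change, continuous_deployment: bool) -> str:
    """Return the last stage the change reaches."""
    if not change.tests_passed:
        return "rejected-by-ci"
    if not change.staging_verified:
        return "staging"
    # Continuous delivery waits here for an approval;
    # continuous deployment promotes automatically.
    if continuous_deployment or change.release_approved:
        return "production"
    return "awaiting-approval"
```

With continuous deployment a verified change goes straight to production; with continuous delivery the same change stops at `awaiting-approval` until someone signs off.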
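Benefits of CI/CD
- Fewer manual errors
- Faster feedback cycles
- Higher code quality and release frequency
- Lower release risk
- More efficient team collaboration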
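Choosing a CI/CD Tool
Comparison of mainstream CI/CD tools
| Tool | Language support | Hosting | Ease of use | Extensibility | Cost |
|---|---|---|---|---|---|
| Jenkins | All | Self-hosted | Medium | High | Free (open source; you run the infrastructure) |
| GitLab CI/CD | All | Self-hosted / SaaS | High | Medium | Free / paid |
| GitHub Actions | All | SaaS (self-hosted runners supported) | High | High | Free / paid |
| Travis CI | Many languages | SaaS | High | Medium | Free / paid |
| CircleCI | All | SaaS | High | Medium | Free / paid |
Recommendations
- Small or personal projects: GitHub Actions or GitLab CI/CD
- Enterprise projects: Jenkins or self-hosted GitLab CI/CD
- Cloud-native applications: GitHub Actions or the CI/CD service offered by your cloud provider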
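Basic Environment Setup
1. Version Control Setup
Git configuration
bash
# Global configuration
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
# Initialize the repository on a branch named main (-b requires Git 2.28 or later)
git init -b main
git add .
git commit -m "Initial commit"
# Connect the remote repository
git remote add origin https://github.com/username/repository.git
git push -u origin main
.gitignore configuration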
gitignore
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
venv/
.env
.venv
pip-log.txt
pip-delete-this-directory.txt
# Node.js
node_modules/
npm-debug.log*
yarn-debug.log*
yarn-error.log*
# IDE
.vscode/
.idea/
*.swp
*.swo
# OS
.DS_Store
Thumbs.db
2. Project Layout Conventions
project-root/
├── .github/
│ └── workflows/ # GitHub Actions workflows
├── .gitlab-ci.yml # GitLab CI configuration
├── Jenkinsfile # Jenkins pipeline configuration
├── src/ # Source code
├── tests/ # Test code
├── docs/ # Documentation
├── requirements.txt # Python dependencies
├── package.json # Node.js dependencies
├── Dockerfile # Docker configuration
└── docker-compose.yml # Docker Compose configuration
CI Pipeline Setup
1. GitHub Actions Configuration
Creating the workflow file
yaml
# .github/workflows/ci.yml
name: CI Pipeline
on:
push:
branches: [ main, develop ]
pull_request:
branches: [ main ]
jobs:
test:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ['3.8', '3.9', '3.10', '3.11']
steps:
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v3
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install pytest pytest-cov
- name: Run tests with coverage
run: |
pytest --cov=./src --cov-report=xml
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v3
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v3
with:
python-version: '3.10'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install flake8 black
- name: Run flake8
run: |
flake8 src/ --count --select=E9,F63,F7,F82 --show-source --statistics
flake8 src/ --count --exit-zero --max-complexity=10 --max-line-length=88 --statistics
- name: Check code formatting with black
run: black --check src/
security:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Run security scan
run: |
# Scan the source tree with bandit, as the GitLab and Jenkins pipelines in this guide do
pip install bandit
bandit -r src/
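The test job above writes a Cobertura-style `coverage.xml` via `--cov-report=xml`. If the pipeline should also fail when coverage drops below a minimum, a small gate script is one option. This is a minimal sketch, assuming the report's root element carries a `line-rate` attribute between 0 and 1 (as coverage.py's XML report does); the function names and the 80% default are illustrative and not part of the original workflow.

```python
import xml.etree.ElementTree as ET


def line_coverage_percent(xml_text: str) -> float:
    """Read the overall line coverage (0-100) from a Cobertura-style report."""
    root = ET.fromstring(xml_text)
    return float(root.attrib["line-rate"]) * 100.0


def meets_threshold(xml_text: str, minimum_percent: float = 80.0) -> bool:
    """Return True when overall line coverage reaches the required minimum."""
    return line_coverage_percent(xml_text) >= minimum_percent
```

In a workflow step, a script like this would read `coverage.xml` from disk and exit with a non-zero status when `meets_threshold` returns `False`.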
2. GitLab CI/CD Configuration
Example .gitlab-ci.yml
yaml
# .gitlab-ci.yml
stages:
- test
- build
- deploy
variables:
PYTHON_VERSION: "3.10"
PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
cache:
paths:
- .cache/pip
- venv/
before_script:
- python -m venv venv
- source venv/bin/activate
- pip install --upgrade pip
- pip install -r requirements.txt
test:
stage: test
image: python:$PYTHON_VERSION
script:
- pip install pytest pytest-cov
- pytest --cov=src --cov-report=term --cov-report=xml tests/
coverage: '/TOTAL.+?(\d+\%)$/'
artifacts:
reports:
coverage_report:
coverage_format: cobertura
path: coverage.xml
lint:
stage: test
image: python:$PYTHON_VERSION
script:
- pip install flake8 black
- flake8 src/
- black --check src/
security:
stage: test
image: python:$PYTHON_VERSION
script:
- pip install bandit
- bandit -r src/
build:
stage: build
image: docker:latest
services:
- docker:dind
# The docker image has no Python, so skip the global before_script
before_script: []
script:
- echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
- docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
- docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
only:
- main
- develop
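The `coverage:` keyword in the `test` job tells GitLab which part of the job log holds the coverage figure. To see what that pattern captures, the sketch below applies the same expression with Python's `re` module. The sample log text is made up for illustration, and Python is used here only to demonstrate the pattern, not because GitLab evaluates it this way.

```python
import re
from typing import Optional

# Same pattern as in the job definition, without the surrounding slashes.
COVERAGE_PATTERN = re.compile(r"TOTAL.+?(\d+%)$", re.MULTILINE)


def extract_total_coverage(job_log: str) -> Optional[int]:
    """Return the percentage on the TOTAL line of a coverage table, if present."""
    match = COVERAGE_PATTERN.search(job_log)
    if match is None:
        return None
    return int(match.group(1).rstrip("%"))


# Hypothetical excerpt of a terminal coverage report.
SAMPLE_LOG = """Name          Stmts   Miss  Cover
---------------------------------
src/app.py       20      2    90%
src/util.py      10      5    50%
---------------------------------
TOTAL            30      7    77%
"""
```

The pattern only matches when a `TOTAL` line is printed, which is why the test command needs a terminal coverage report in addition to the XML one.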
3. Jenkins Pipeline Configuration
Example Jenkinsfile
groovy
// Jenkinsfile
pipeline {
agent any
environment {
PYTHON_VERSION = '3.10'
DOCKER_REGISTRY = 'your-registry.com'
IMAGE_NAME = 'your-app'
}
stages {
stage('Checkout') {
steps {
checkout scm
}
}
stage('Setup Environment') {
steps {
sh 'python -m venv venv'
sh '. venv/bin/activate && pip install --upgrade pip'
sh '. venv/bin/activate && pip install -r requirements.txt'
}
}
stage('Lint') {
steps {
sh '. venv/bin/activate && pip install flake8 black'
sh '. venv/bin/activate && flake8 src/'
sh '. venv/bin/activate && black --check src/'
}
}
stage('Test') {
steps {
sh '. venv/bin/activate && pip install pytest pytest-cov'
// Write the HTML report to htmlcov/ so the publishHTML step below can find it
sh '. venv/bin/activate && pytest --cov=src --cov-report=html tests/'
}
post {
always {
publishHTML([
allowMissing: false,
alwaysLinkToLastBuild: true,
keepAll: true,
reportDir: 'htmlcov',
reportFiles: 'index.html',
reportName: 'Coverage Report'
])
}
}
}
stage('Security Scan') {
steps {
sh '. venv/bin/activate && pip install bandit'
sh '. venv/bin/activate && bandit -r src/'
}
}
stage('Build Docker Image') {
when {
branch 'main'
}
steps {
script {
def image = docker.build("${env.DOCKER_REGISTRY}/${env.IMAGE_NAME}:${env.BUILD_NUMBER}")
docker.withRegistry("https://${env.DOCKER_REGISTRY}", 'docker-registry-credentials') {
image.push()
image.push('latest')
}
}
}
}
}
post {
always {
cleanWs()
}
success {
echo 'Pipeline succeeded!'
}
failure {
echo 'Pipeline failed!'
emailext (
subject: "Failed Pipeline: ${env.JOB_NAME} - ${env.BUILD_NUMBER}",
body: "Check console output at ${env.BUILD_URL}",
to: "${env.CHANGE_AUTHOR_EMAIL}"
)
}
}
}
CD Pipeline Setup
1. Deployment Strategies
Blue-green deployment
yaml
# .github/workflows/deploy-blue-green.yml
name: Blue-Green Deployment
on:
workflow_run:
workflows: ["CI Pipeline"]
types:
- completed
branches: [main]
jobs:
deploy:
runs-on: ubuntu-latest
if: ${{ github.event.workflow_run.conclusion == 'success' }}
steps:
- uses: actions/checkout@v3
- name: Setup kubectl
uses: azure/setup-kubectl@v1
with:
version: 'v1.24.0'
- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v1
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-east-1
- name: Deploy to Blue Environment
run: |
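# Update the blue environment (the image is referenced as registry/app:<commit SHA>)
kubectl set image deployment/app-blue app=${{ secrets.DOCKER_REGISTRY }}/app:${{ github.event.workflow_run.head_sha }} --namespace=production
# Wait for the blue environment to become ready
kubectl rollout status deployment/app-blue --namespace=production
# Run a health check from inside the cluster, in the same namespace as the service
kubectl run health-check --namespace=production --image=curlimages/curl --rm -i --restart=Never -- \
curl -f http://app-blue-service/health
- name: Switch Traffic to Blue
run: |
# Switch traffic to the blue environment
kubectl patch service app-service -p '{"spec":{"selector":{"version":"blue"}}}' --namespace=production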
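The workflow above always deploys to blue. In a full blue-green cycle the target alternates: the new version goes to whichever color is idle, and the service selector then flips to it. A minimal sketch of that bookkeeping, assuming the two colors are labelled `blue` and `green` through a `version` label as in the patch above (the helper names are hypothetical):

```python
import json


def idle_color(live_color: str) -> str:
    """Return the color that is not serving traffic and can take the new version."""
    if live_color not in ("blue", "green"):
        raise ValueError(f"unknown color: {live_color!r}")
    return "green" if live_color == "blue" else "blue"


def selector_patch(color: str) -> str:
    """Build the JSON patch body that points the service at the given color."""
    return json.dumps({"spec": {"selector": {"version": color}}}, separators=(",", ":"))
```

`selector_patch("blue")` produces the same payload that the `kubectl patch` command above passes with `-p`; rolling back is the same patch with the previous color.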
Rolling update
yaml
# .github/workflows/rolling-update.yml
name: Rolling Update Deployment
on:
push:
branches: [main]
jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Setup Helm
uses: azure/setup-helm@v1
with:
version: 'v3.8.0'
- name: Deploy with Helm
run: |
helm upgrade --install app ./helm-chart \
--set image.tag=${{ github.sha }} \
--set image.repository=${{ secrets.DOCKER_REGISTRY }}/app \
--namespace=production \
--wait \
--timeout=10m
Canary release
yaml
# .github/workflows/canary-deployment.yml
name: Canary Deployment
on:
push:
branches: [main]
jobs:
deploy-canary:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Deploy Canary Version
run: |
# Deploy the canary version (10% of traffic)
kubectl apply -f k8s/canary-deployment.yaml
# Configure traffic splitting
kubectl apply -f k8s/canary-service.yaml
- name: Monitor Canary
run: |
# Monitor the canary's health. "canary" is a custom resource, so this assumes a
# canary controller is installed; the phase names depend on that controller.
HEALTHY=false
for i in {1..10}; do
STATUS=$(kubectl get canary app-canary -o jsonpath='{.status.phase}')
if [ "$STATUS" = "Progressing" ]; then
echo "Canary deployment is progressing..."
sleep 30
elif [ "$STATUS" = "Healthy" ]; then
echo "Canary is healthy, promoting to full rollout"
HEALTHY=true
break
else
echo "Canary failed, rolling back"
kubectl delete -f k8s/canary-deployment.yaml
exit 1
fi
done
# Do not promote if the canary never became healthy within the polling window
if [ "$HEALTHY" != "true" ]; then
echo "Canary did not become healthy in time, rolling back"
kubectl delete -f k8s/canary-deployment.yaml
exit 1
fi
- name: Promote to Full Rollout
run: |
# Promote to a full rollout
kubectl set image deployment/app app=${{ secrets.DOCKER_REGISTRY }}/app:${{ github.sha }}
kubectl delete -f k8s/canary-deployment.yaml
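The monitoring step is a polling loop with three outcomes: healthy, failed, or still progressing when the attempts run out. The last case is easy to overlook, and it should count as a failure rather than fall through to promotion. A minimal sketch of that logic, using the phase names from the workflow above; `get_phase` is a hypothetical callable standing in for the `kubectl get canary` query.

```python
import time
from typing import Callable


def wait_for_canary(
    get_phase: Callable[[], str],
    attempts: int = 10,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the canary phase; return True only if it becomes Healthy in time."""
    for _ in range(attempts):
        phase = get_phase()
        if phase == "Healthy":
            return True
        if phase != "Progressing":
            return False  # any other phase is treated as a failed canary
        sleep(delay_seconds)
    return False  # still progressing after the last attempt: treat as a timeout
```

Passing `sleep` as a parameter keeps the function testable without real waiting; the caller promotes only when the result is `True` and rolls back otherwise.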
2. Infrastructure as Code (IaC)
Terraform configuration
hcl
# terraform/main.tf
provider "aws" {
region = var.aws_region
}
# VPC configuration
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
enable_dns_hostnames = true
enable_dns_support = true
tags = {
Name = "main-vpc"
}
}
# EKS cluster
resource "aws_eks_cluster" "main" {
name = var.cluster_name
role_arn = aws_iam_role.eks_cluster.arn
version = "1.24"
vpc_config {
subnet_ids = concat(
aws_subnet.public.*.id,
aws_subnet.private.*.id
)
}
depends_on = [
aws_iam_role_policy_attachment.eks_cluster_policy,
]
}
# Application deployment
resource "kubernetes_deployment" "app" {
metadata {
name = "app"
labels = {
app = "my-app"
}
}
spec {
replicas = 3
selector {
match_labels = {
app = "my-app"
}
}
template {
metadata {
labels = {
app = "my-app"
}
}
spec {
container {
image = "${var.docker_registry}/app:${var.image_tag}"
name = "app"
port {
container_port = 8080
}
resources {
limits = {
cpu = "0.5"
memory = "512Mi"
}
requests = {
cpu = "250m"
memory = "50Mi"
}
}
}
}
}
}
depends_on = [
aws_eks_cluster.main,
]
}
# Expose the service
resource "kubernetes_service" "app" {
metadata {
name = "app-service"
}
spec {
selector = {
app = kubernetes_deployment.app.metadata[0].labels.app
}
port {
protocol = "TCP"
port = 80
target_port = 8080
}
type = "LoadBalancer"
}
depends_on = [
kubernetes_deployment.app,
]
}
Terraform CI/CD integration
yaml
# .github/workflows/terraform.yml
name: Terraform CI/CD
on:
push:
paths:
- 'terraform/**'
branches: [main]
pull_request:
paths:
- 'terraform/**'
branches: [main]
jobs:
terraform:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Setup Terraform
uses: hashicorp/setup-terraform@v1
with:
terraform_version: 1.3.0
- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v1
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-east-1
- name: Terraform Format Check
run: terraform fmt -check
working-directory: ./terraform
- name: Terraform Init
run: terraform init
working-directory: ./terraform
- name: Terraform Validate
run: terraform validate
working-directory: ./terraform
- name: Terraform Plan
if: github.event_name == 'pull_request'
run: terraform plan -out=plan.tfplan
working-directory: ./terraform
- name: Terraform Apply
if: github.ref == 'refs/heads/main' && github.event_name == 'push'
run: terraform apply -auto-approve
working-directory: ./terraform
Advanced Configuration and Best Practices
1. Managing Multiple Environments
Environment configuration layout
configs/
├── environments/
│ ├── development/
│ │ ├── variables.yml
│ │ └── config.yml
│ ├── staging/
│ │ ├── variables.yml
│ │ └── config.yml
│ └── production/
│ ├── variables.yml
│ └── config.yml
└── templates/
├── docker-compose.yml.j2
└── k8s-deployment.yml.j2
Multi-environment deployment configuration
yaml
# .github/workflows/multi-env-deploy.yml
name: Multi-Environment Deployment
on:
workflow_dispatch:
inputs:
environment:
description: 'Target environment'
required: true
default: 'staging'
type: choice
options:
- development
- staging
- production
env:
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Log in to Container Registry
uses: docker/login-action@v2
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Extract metadata
id: meta
uses: docker/metadata-action@v4
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
- name: Build and push Docker image
uses: docker/build-push-action@v4
with:
context: .
push: true
tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
labels: ${{ steps.meta.outputs.labels }}
deploy:
needs: build
runs-on: ubuntu-latest
environment: ${{ github.event.inputs.environment }}
steps:
- uses: actions/checkout@v3
- name: Deploy to ${{ github.event.inputs.environment }}
run: |
# Use environment-specific configuration
echo "Deploying to ${{ github.event.inputs.environment }}"
# Set per-environment values; these take precedence over the values passed in through the env block below
if [ "${{ github.event.inputs.environment }}" == "production" ]; then
NAMESPACE="production"
REPLICA_COUNT="5"
CPU_LIMIT="1000m"
MEMORY_LIMIT="2Gi"
elif [ "${{ github.event.inputs.environment }}" == "staging" ]; then
NAMESPACE="staging"
REPLICA_COUNT="3"
CPU_LIMIT="500m"
MEMORY_LIMIT="1Gi"
else
NAMESPACE="development"
REPLICA_COUNT="1"
CPU_LIMIT="200m"
MEMORY_LIMIT="512Mi"
fi
# Export the values so envsubst can substitute them into the template
export NAMESPACE REPLICA_COUNT CPU_LIMIT MEMORY_LIMIT
# Apply the configuration
envsubst < k8s/deployment.yml.template | kubectl apply -f -
# Wait for the rollout to finish
kubectl rollout status deployment/app --namespace=$NAMESPACE
env:
IMAGE_TAG: ${{ github.sha }}
NAMESPACE: ${{ github.event.inputs.environment }}
REPLICA_COUNT: ${{ vars.REPLICA_COUNT }}
CPU_LIMIT: ${{ vars.CPU_LIMIT }}
MEMORY_LIMIT: ${{ vars.MEMORY_LIMIT }}
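The deploy step combines two ideas: a per-environment table of settings and template substitution with `envsubst`. The sketch below shows the same pattern in Python, with `string.Template` standing in for `envsubst`. The settings mirror the values in the workflow above; the template text in the usage note is a made-up fragment, not the real `k8s/deployment.yml.template`.

```python
from string import Template

# Per-environment settings, copied from the deploy step above.
ENVIRONMENTS = {
    "production": {"NAMESPACE": "production", "REPLICA_COUNT": "5",
                   "CPU_LIMIT": "1000m", "MEMORY_LIMIT": "2Gi"},
    "staging": {"NAMESPACE": "staging", "REPLICA_COUNT": "3",
                "CPU_LIMIT": "500m", "MEMORY_LIMIT": "1Gi"},
    "development": {"NAMESPACE": "development", "REPLICA_COUNT": "1",
                    "CPU_LIMIT": "200m", "MEMORY_LIMIT": "512Mi"},
}


def render_manifest(template_text: str, environment: str, image_tag: str) -> str:
    """Fill $VAR / ${VAR} placeholders with the settings of one environment."""
    # Unknown environments fall back to development, like the else branch above.
    settings = ENVIRONMENTS.get(environment, ENVIRONMENTS["development"])
    return Template(template_text).substitute(settings, IMAGE_TAG=image_tag)
```

For example, `render_manifest("replicas: ${REPLICA_COUNT}", "staging", "abc123")` fills in the staging replica count. Unlike `envsubst`, `substitute` raises `KeyError` for a placeholder with no value, which surfaces typos in the template early.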
2. Secrets Management
Using Kubernetes Secrets
yaml
# k8s/secrets.yml
apiVersion: v1
kind: Secret
metadata:
name: app-secrets
type: Opaque
data:
database-url: <base64-encoded-database-url>
api-key: <base64-encoded-api-key>
jwt-secret: <base64-encoded-jwt-secret>
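Each value under `data:` must be the base64 encoding of the raw secret bytes. A common mistake is to encode a value with a trailing newline (for example, by piping `echo` into `base64`), which silently changes the secret. A minimal sketch of the encoding; the helper names and sample value are illustrative only.

```python
import base64


def encode_secret_value(value: str) -> str:
    """Base64-encode a secret value for the data field of a Secret manifest."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse of encode_secret_value, useful for checking a manifest."""
    return base64.b64decode(encoded).decode("utf-8")
```

Note that base64 is an encoding, not encryption: anyone who can read the manifest can recover the value, so manifests containing real secrets should not be committed to the repository.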
External secrets management
yaml
# .github/workflows/secrets-management.yml
name: Secrets Management
on: push
jobs:
manage-secrets:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Setup Vault
uses: hashicorp/vault-action@v2.4.0
with:
url: ${{ secrets.VAULT_URL }}
token: ${{ secrets.VAULT_TOKEN }}
secrets: |
secret/data/app/database DATABASE_URL;
secret/data/app/api API_KEY;
secret/data/app/jwt JWT_SECRET;
- name: Apply Kubernetes Secrets
run: |
# Let kubectl do the base64 encoding; this avoids temporary files and the
# trailing newline that echo would add to each encoded value
kubectl create secret generic app-secrets \
--from-literal=database-url="$DATABASE_URL" \
--from-literal=api-key="$API_KEY" \
--from-literal=jwt-secret="$JWT_SECRET" \
--dry-run=client -o yaml | kubectl apply -f -
3. Testing Strategy
Multi-stage testing
yaml
# .github/workflows/testing.yml
name: Testing Strategy
on: [push, pull_request]
jobs:
unit-tests:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.10'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements-test.txt
- name: Run unit tests
run: |
pytest tests/unit/ -v --cov=src --cov-report=xml --cov-report=html
- name: Upload coverage reports
uses: codecov/codecov-action@v3
with:
file: ./coverage.xml
integration-tests:
runs-on: ubuntu-latest
needs: unit-tests
services:
postgres:
image: postgres:13
env:
POSTGRES_PASSWORD: postgres
POSTGRES_DB: test_db
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
ports:
- 5432:5432
redis:
image: redis:6
options: >-
--health-cmd "redis-cli ping"
--health-interval 10s
--health-timeout 5s
--health-retries 5
ports:
- 6379:6379
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.10'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements-test.txt
- name: Wait for services
run: |
sleep 10
- name: Run integration tests
env:
DATABASE_URL: postgresql://postgres:postgres@localhost:5432/test_db
REDIS_URL: redis://localhost:6379/0
run: |
pytest tests/integration/ -v
e2e-tests:
runs-on: ubuntu-latest
needs: integration-tests
steps:
- uses: actions/checkout@v3
- name: Set up Node.js
uses: actions/setup-node@v3
with:
node-version: '16'
cache: 'npm'
cache-dependency-path: e2e/package-lock.json
- name: Install dependencies
run: |
cd e2e
npm ci
- name: Start application
run: |
docker compose up -d
sleep 30
- name: Run E2E tests
run: |
cd e2e
npm run test:headless
- name: Stop application
if: always()
run: |
docker compose down
Monitoring and Optimization
1. CI/CD Pipeline Monitoring
Pipeline performance monitoring
yaml
# .github/workflows/monitoring.yml
name: Pipeline Monitoring
on:
workflow_dispatch:
schedule:
- cron: '0 6 * * 1' # Runs every Monday at 06:00 UTC
jobs:
analyze-performance:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
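- name: Fetch workflow runs
uses: actions/github-script@v6
id: fetch-runs
with:
script: |
const response = await github.rest.actions.listWorkflowRuns({
owner: context.repo.owner,
repo: context.repo.repo,
workflow_id: 'ci.yml',
status: 'completed',
per_page: 30,
});
const runs = response.data.workflow_runs;
const count = runs.length || 1;
// Timestamps are ISO 8601 strings, so convert them to dates before subtracting
const totalMs = runs.reduce((sum, run) => sum + (new Date(run.updated_at) - new Date(run.created_at)), 0);
const succeeded = runs.filter((run) => run.conclusion === 'success').length;
core.setOutput('avg-duration', Math.round(totalMs / count / 1000));
core.setOutput('success-rate', Math.round((succeeded / count) * 100));
- name: Analyze performance data
run: |
# Report the pipeline's execution time and success rate
echo "Average pipeline duration: ${AVG_DURATION} seconds"
echo "Success rate: ${SUCCESS_RATE}%"
env:
AVG_DURATION: ${{ steps.fetch-runs.outputs.avg-duration }}
SUCCESS_RATE: ${{ steps.fetch-runs.outputs.success-rate }}
- name: Generate performance report
run: |
# Generate the performance report
cat > performance-report.md << EOF
# CI/CD Pipeline Performance Report
Generated on: $(date)
## Metrics
- Average pipeline duration: ${AVG_DURATION} seconds
- Success rate: ${SUCCESS_RATE}%
- Average test execution time: not collected by this workflow
## Recommendations
1. Consider caching dependencies to reduce build time
2. Optimize test suite execution
3. Review long-running steps for optimization
EOF
# Upload the report as a pre-release (the token needs permission to create releases)
gh release create performance-report-$(date +%Y%m%d) ./performance-report.md --title "Performance Report $(date +%Y-%m-%d)" --prerelease
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
AVG_DURATION: ${{ steps.fetch-runs.outputs.avg-duration }}
SUCCESS_RATE: ${{ steps.fetch-runs.outputs.success-rate }}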
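The two metrics in the report can be derived from the run list alone. The key detail is that `created_at` and `updated_at` arrive as ISO 8601 strings and must be parsed before they can be subtracted. A minimal sketch of the calculation, assuming each run is a dictionary with `created_at`, `updated_at`, and `conclusion` keys as in the API response used above; the function names are hypothetical.

```python
from datetime import datetime
from typing import Dict, List


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2024-01-01T10:00:00Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_runs(runs: List[Dict[str, str]]) -> Dict[str, int]:
    """Compute average duration (seconds) and success rate (percent)."""
    if not runs:
        return {"avg_duration_seconds": 0, "success_rate_percent": 0}
    durations = [
        (parse_timestamp(run["updated_at"]) - parse_timestamp(run["created_at"])).total_seconds()
        for run in runs
    ]
    successes = sum(1 for run in runs if run.get("conclusion") == "success")
    return {
        "avg_duration_seconds": round(sum(durations) / len(runs)),
        "success_rate_percent": round(100 * successes / len(runs)),
    }
```

The interval from creation to last update includes queue time, so it measures how long a contributor waits for feedback rather than pure execution time.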
2. Application Monitoring Integration
Post-deployment health check
yaml
# .github/workflows/health-check.yml
name: Post-Deployment Health Check
on:
workflow_run:
workflows: ["Deploy to Production"]
types:
- completed
branches: [main]
jobs:
health-check:
runs-on: ubuntu-latest
if: ${{ github.event.workflow_run.conclusion == 'success' }}
steps:
- uses: actions/checkout@v3
- name: Health Check
run: |
# Basic health check
STATUS_CODE=$(curl -s -o /dev/null -w "%{http_code}" https://api.example.com/health)
if [ "$STATUS_CODE" -ne 200 ]; then
echo "Health check failed with status code: $STATUS_CODE"
exit 1
fi
# Deep health check
DEEP_CHECK=$(curl -s https://api.example.com/health/deep)
echo "Deep health check result: $DEEP_CHECK"
# Check critical dependencies
DB_STATUS=$(echo "$DEEP_CHECK" | jq -r '.database')
if [ "$DB_STATUS" != "healthy" ]; then
echo "Database is not healthy"
exit 1
fi
- name: Run Smoke Tests
run: |
# Run the smoke tests
docker run --rm -v $(pwd):/app -w /app node:16-alpine sh -c "
npm install -g newman
newman run tests/smoke/postman_collection.json --environment tests/smoke/production.env
"
- name: Update Monitoring Dashboards
run: |
# Annotate the monitoring dashboards with the deployment
curl -X POST https://monitoring.example.com/api/v1/annotations \
-H "Authorization: Bearer ${{ secrets.MONITORING_API_KEY }}" \
-H "Content-Type: application/json" \
-d '{
"text": "Deployment completed successfully",
"tags": ["deployment", "success"],
"time": '$(date +%s)'
}'
- name: Notify Team
if: success()
run: |
# Send a notification
curl -X POST -H 'Content-type: application/json' \
--data '{"text":"✅ Application deployed successfully to production. Health checks passed."}' \
"${{ secrets.SLACK_WEBHOOK }}"
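The health-check step makes its decision in two stages: the HTTP status of the basic endpoint, then the `database` field of the deep check. The same rule is sketched below as a pure function, which makes the pass/fail cases easy to test in isolation. This is a minimal sketch that assumes the deep check returns a JSON object with a `database` key, as the `jq` query above implies; the function name is hypothetical.

```python
import json


def deployment_is_healthy(status_code: int, deep_check_body: str) -> bool:
    """Apply the workflow's rule: HTTP 200 and a healthy database dependency."""
    if status_code != 200:
        return False
    try:
        report = json.loads(deep_check_body)
    except json.JSONDecodeError:
        return False  # an unparsable deep-check response counts as unhealthy
    return isinstance(report, dict) and report.get("database") == "healthy"
```

Treating a malformed response as unhealthy is a deliberate choice here: a deployment that cannot report its own status should not be announced as successful.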
3. Performance Optimization
Caching strategy
yaml
# .github/workflows/caching.yml
name: Optimized CI with Caching
on: [push, pull_request]
jobs:
build-and-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Cache Python dependencies
uses: actions/cache@v3
with:
path: ~/.cache/pip
key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements*.txt') }}
restore-keys: |
${{ runner.os }}-pip-
- name: Cache Docker layers
uses: actions/cache@v3
with:
path: /tmp/.buildx-cache
key: ${{ runner.os }}-buildx-${{ github.sha }}
restore-keys: |
${{ runner.os }}-buildx-
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Build Docker image with cache
uses: docker/build-push-action@v4
with:
context: .
push: false
load: true
tags: app:test
cache-from: type=local,src=/tmp/.buildx-cache
cache-to: type=local,dest=/tmp/.buildx-cache-new,mode=max
- name: Move cache
run: |
rm -rf /tmp/.buildx-cache
mv /tmp/.buildx-cache-new /tmp/.buildx-cache
- name: Run tests
run: |
docker run --rm -v $(pwd):/app app:test python -m pytest
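The pip cache key above changes only when a requirements file changes, because `hashFiles` folds the contents of the matched files into one digest. The sketch below imitates that idea so the behavior is easy to reason about. It is an approximation for illustration, not GitHub's implementation; the function names and the in-memory `files` mapping are assumptions.

```python
import hashlib
from typing import Iterable, Mapping, Optional


def dependency_cache_key(os_name: str, files: Mapping[str, bytes]) -> str:
    """Build a key of the form <os>-pip-<digest of all dependency files>."""
    combined = hashlib.sha256()
    for path in sorted(files):  # sort so the key does not depend on listing order
        combined.update(hashlib.sha256(files[path]).digest())
    return f"{os_name}-pip-{combined.hexdigest()}"


def restore_candidate(prefix: str, existing_keys: Iterable[str]) -> Optional[str]:
    """Mimic a restore-keys prefix match: pick any stored key with the prefix."""
    for key in existing_keys:
        if key.startswith(prefix):
            return key
    return None
```

An exact key hit restores the cache as is. When the requirements change, the prefix fallback still restores an older cache, so pip only has to download what differs.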
Parallelization strategy
yaml
# .github/workflows/parallelization.yml
name: Parallelized CI Pipeline
on: [push, pull_request]
jobs:
split-tests:
runs-on: ubuntu-latest
outputs:
test-files: ${{ steps.split.outputs.test-files }}
total-parts: ${{ steps.split.outputs.total-parts }}
steps:
- uses: actions/checkout@v3
- name: Split tests
id: split
run: |
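# Collect all test files (assumes pytest's default test_*.py naming)
TEST_FILES=$(find tests/ -name "test_*.py" -type f | sort | tr '\n' ' ')
echo "test-files=$TEST_FILES" >> $GITHUB_OUTPUT
# Choose the number of parts based on the file count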
FILE_COUNT=$(echo $TEST_FILES | wc -w)
if [ $FILE_COUNT -le 10 ]; then
PARTS=2
elif [ $FILE_COUNT -le 20 ]; then
PARTS=4
else
PARTS=6
fi
echo "total-parts=$PARTS" >> $GITHUB_OUTPUT
test-matrix:
runs-on: ubuntu-latest
needs: split-tests
strategy:
matrix:
part: [1, 2, 3, 4, 5, 6]
fail-fast: false
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.10'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install pytest pytest-xdist
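- name: Run tests for part ${{ matrix.part }}
run: |
TEST_FILES="${{ needs.split-tests.outputs.test-files }}"
TOTAL_PARTS="${{ needs.split-tests.outputs.total-parts }}"
PART="${{ matrix.part }}"
# The matrix always has 6 entries; parts beyond TOTAL_PARTS have nothing to run
if [ "$PART" -gt "$TOTAL_PARTS" ]; then
echo "Part $PART is not needed (only $TOTAL_PARTS parts); skipping"
exit 0
fi
# Select this part's share of the files round-robin
SELECTED=$(echo $TEST_FILES | tr ' ' '\n' | awk -v n="$TOTAL_PARTS" -v p="$PART" '(NR - 1) % n == p - 1')
if [ -z "$SELECTED" ]; then
echo "No test files assigned to part $PART; skipping"
exit 0
fi
# Run the selected files in parallel with pytest-xdist
pytest -n auto --dist=loadfile --maxfail=5 \
--junitxml=results-${{ matrix.part }}.xml $SELECTED
- name: Upload test results
uses: actions/upload-artifact@v4
with:
name: test-results-${{ matrix.part }}
path: results-${{ matrix.part }}.xml
merge-results:
runs-on: ubuntu-latest
needs: test-matrix
if: always()
steps:
# The pattern and merge-multiple inputs require v4 of the artifact actions
- uses: actions/download-artifact@v4
with:
pattern: test-results-*
merge-multiple: true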
- name: Merge test reports
run: |
# Merge the per-part test reports
pip install junitparser
python -c "
import glob
from junitparser import JUnitXml
files = sorted(glob.glob('results-*.xml'))
result = JUnitXml()
for file in files: result += JUnitXml.fromfile(file)
result.write('merged-results.xml')
print(f'Merged test results from {len(files)} files')
"
- name: Publish test results
uses: actions/upload-artifact@v3
with:
name: merged-test-results
path: merged-results.xml
- name: Report test status
if: failure()
run: |
# Notify the team if this job failed
curl -X POST -H 'Content-type: application/json' \
--data '{"text":"🚨 Tests failed in CI pipeline. Check the results for details."}' \
"${{ secrets.SLACK_WEBHOOK }}"
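For the matrix to save time, each part must run a different slice of the test files, and together the slices must cover every file exactly once. A minimal sketch of one way to do that, round-robin over a sorted file list; the part-count thresholds are the ones used in the `split-tests` job, while the function names are hypothetical.

```python
from typing import List, Sequence


def parts_for(file_count: int) -> int:
    """Choose how many parts to use, with the thresholds from the split-tests job."""
    if file_count <= 10:
        return 2
    if file_count <= 20:
        return 4
    return 6


def shard(files: Sequence[str], part: int, total_parts: int) -> List[str]:
    """Return the files assigned to a 1-based part, distributed round-robin."""
    if not 1 <= part <= total_parts:
        raise ValueError("part must be between 1 and total_parts")
    ordered = sorted(files)  # sort so every matrix job sees the same order
    return ordered[part - 1::total_parts]
```

Round-robin keeps the parts similar in size but ignores how long each file takes; splitting by recorded test durations would balance the jobs better at the cost of extra bookkeeping.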
Common Problems and Solutions
1. Resource Limits
Solution: configure resource limits and optimize usage
yaml
# .github/workflows/resource-optimization.yml
name: Resource Optimized CI
on: [push, pull_request]
jobs:
optimized-build:
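# Use a GitHub-hosted runner by default. To use a larger self-hosted runner,
# replace the line below with: runs-on: [self-hosted, large]
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Limit memory usage
run: |
# Cap the container's memory and CPU
docker run --memory=2g --cpus=2.0 my-image
- name: Free disk space
run: |
# Remove files the build does not need to free up disk space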
sudo rm -rf /usr/share/dotnet/
sudo rm -rf /opt/ghc/
sudo rm -rf "/usr/local/share/boost"
sudo apt-get clean
docker system prune -af
2. Concurrent Execution
Solution: environment isolation and resource locking
yaml
# .github/workflows/concurrency.yml
name: Concurrent Execution Management
on: [push]
jobs:
deploy:
runs-on: ubuntu-latest
# Prevent concurrent deployments of the same ref; new runs wait instead of cancelling a deployment midway
concurrency:
group: deploy-${{ github.ref }}
cancel-in-progress: false
steps:
- uses: actions/checkout@v3
- name: Acquire deployment lock
uses: softprops/turnstyle@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
same-branch-only: true
- name: Deploy application
run: |
# Deployment steps
echo "Deploying..."
- name: Release deployment lock
if: always()
run: |
# Release the deployment lock
echo "Deployment completed or failed, releasing lock"
3. Flaky Tests
Solution: retries and test isolation
yaml
# .github/workflows/flaky-tests.yml
name: Handling Flaky Tests
on: [push, pull_request]
jobs:
test-with-retry:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.10'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install pytest pytest-rerunfailures pytest-xdist
- name: Run tests with retry
run: |
# Rerun failing tests up to 3 times, and write a JUnit report for the analysis step
pytest --reruns=3 --reruns-delay=5 --maxfail=10 -n auto --junitxml=test-results.xml
- name: Run tests in isolation
run: |
# Use containers to isolate the test environment
docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from test
- name: Report tests that still fail
if: always()
run: |
# Parse the JUnit report and list the tests that failed even after the reruns
python -c "
import sys
import xml.etree.ElementTree as ET
root = ET.parse('test-results.xml').getroot()
failed_tests = [case.get('name') for case in root.iter('testcase') if case.find('failure') is not None]
if failed_tests: print('Tests still failing after reruns:')
for test in failed_tests: print(f' - {test}')
sys.exit(1 if failed_tests else 0)
"
4. Secrets Management
Solution: secure secret handling and rotation
yaml
# .github/workflows/secret-management.yml
name: Advanced Secret Management
on:
schedule:
- cron: '0 2 1 * *' # Runs at 02:00 UTC on the first day of each month
workflow_dispatch:
jobs:
rotate-secrets:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Generate new secrets
id: generate-secrets
run: |
# Generate a new API key and JWT secret
NEW_API_KEY=$(openssl rand -hex 32)
NEW_JWT_SECRET=$(openssl rand -base64 48)
# Mask the values so they are redacted from the logs
echo "::add-mask::$NEW_API_KEY"
echo "::add-mask::$NEW_JWT_SECRET"
# Expose them to later steps as step outputs
echo "api-key=$NEW_API_KEY" >> $GITHUB_OUTPUT
echo "jwt-secret=$NEW_JWT_SECRET" >> $GITHUB_OUTPUT
- name: Update application secrets
run: |
# Update the Kubernetes Secret the application reads (assumes kubectl is already
# configured for the cluster). Never commit secret values to the repository:
# they would remain in the Git history.
kubectl patch secret app-secrets --type merge \
-p "{\"stringData\":{\"api-key\":\"$NEW_API_KEY\",\"jwt-secret\":\"$NEW_JWT_SECRET\"}}"
env:
NEW_API_KEY: ${{ steps.generate-secrets.outputs.api-key }}
NEW_JWT_SECRET: ${{ steps.generate-secrets.outputs.jwt-secret }}
- name: Update external services
run: |
# Update the keys held by external services
curl -X POST https://api.example.com/rotate-secrets \
-H "Authorization: Bearer ${{ secrets.ADMIN_API_KEY }}" \
-H "Content-Type: application/json" \
-d '{
"new_api_key": "${{ steps.generate-secrets.outputs.api-key }}",
"new_jwt_secret": "${{ steps.generate-secrets.outputs.jwt-secret }}"
}'
- name: Update GitHub Secrets
run: |
# Update the repository secrets
gh secret set API_KEY --body "${{ steps.generate-secrets.outputs.api-key }}"
gh secret set JWT_SECRET --body "${{ steps.generate-secrets.outputs.jwt-secret }}"
env:
# The default GITHUB_TOKEN cannot write repository secrets; this assumes a personal
# access token stored under the hypothetical secret name ADMIN_GH_TOKEN
GH_TOKEN: ${{ secrets.ADMIN_GH_TOKEN }}
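The rotation workflow generates its values with `openssl rand`. When rotation is driven from a script instead, Python's standard `secrets` module provides equivalent cryptographically strong randomness. A minimal sketch that mirrors the two `openssl` commands above (32 random bytes as hex, 48 random bytes as base64); the function names are hypothetical.

```python
import base64
import secrets


def new_api_key() -> str:
    """32 random bytes as 64 hex characters, like `openssl rand -hex 32`."""
    return secrets.token_hex(32)


def new_jwt_secret() -> str:
    """48 random bytes as 64 base64 characters, like `openssl rand -base64 48`."""
    return base64.b64encode(secrets.token_bytes(48)).decode("ascii")
```

The `random` module is not suitable for this purpose, since its generator is designed for simulation rather than security.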
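Summary
CI/CD is an essential part of modern software development: by automating the build, test, and deployment process, it speeds up software delivery and improves its quality. This guide has covered CI/CD configuration from basic concepts to advanced practice, including:
- Choosing a suitable CI/CD tool
- Designing efficient pipelines
- Managing multiple environments and deployment strategies
- Handling secrets and security concerns
- Monitoring and optimizing pipeline performance
- Solving common problems
By following these practices, a team can build a stable, reliable, and efficient CI/CD process that accelerates delivery while maintaining high quality standards.
Continuous Improvement
CI/CD is not a one-time setup but an ongoing process of improvement. Teams are advised to do the following regularly:
- Analyze pipeline performance metrics
- Collect feedback from the team
- Evaluate new tools and techniques
- Optimize bottleneck steps
- Update security practices
With continuous improvement, the CI/CD pipeline keeps adapting to the needs of the project and the team, and provides a stable, dependable foundation for software development.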