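I. Architecture Design
1. Technology Stack
- GitLab: source code repository and CI/CD management
- GitLab Runner: executes the CI/CD pipelines
- Docker: container runtime
- Docker Registry: private image registry (GitLab Container Registry)
- Spring Cloud Alibaba: microservice framework
- Nacos: service registry and configuration center
- Sentinel: flow control
- Seata: distributed transactions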
2. Branching Strategy
text
main/master → production environment (production)
release/* → pre-release environment (staging)
develop → testing environment (testing)
feature/* → development environment (development)
hotfix/* → hotfix environment
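The mapping above can be sketched as a small shell helper, for example to choose a deployment target inside a CI script. The function name `env_for_branch` and the `hotfix` and `unknown` return values are illustrative assumptions, not part of the original pipeline.

```shell
#!/bin/sh
# Map a Git branch name to the environment it deploys to,
# mirroring the branch table above.
env_for_branch() {
  case "$1" in
    main|master) echo "production" ;;
    release/*)   echo "staging" ;;
    develop)     echo "testing" ;;
    feature/*)   echo "development" ;;
    hotfix/*)    echo "hotfix" ;;          # assumed label for the hotfix environment
    *)           echo "unknown"; return 1 ;;
  esac
}
```

A CI script could call it as `TARGET_ENV=$(env_for_branch "$CI_COMMIT_REF_NAME")` and abort when the function returns a non-zero status.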
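3. Environment Topology
text
Development (development): single-node deployment, using a local Nacos
Testing (testing): full microservice cluster with its own Nacos
Staging (staging): configured 1:1 with production
Production (production): high-availability cluster, deployed across multiple AZs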
II. Project Structure and Configuration
1. Spring Cloud Alibaba Multi-Environment Configuration
Parent pom.xml:
xml
<properties>
    <spring-boot.version>2.7.15</spring-boot.version>
    <spring-cloud.version>2021.0.8</spring-cloud.version>
    <spring-cloud-alibaba.version>2021.0.5.0</spring-cloud-alibaba.version>
    <docker.image.prefix>registry.gitlab.com/your-group</docker.image.prefix>
</properties>
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-dependencies</artifactId>
            <version>${spring-boot.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-dependencies</artifactId>
            <version>${spring-cloud.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
        <dependency>
            <groupId>com.alibaba.cloud</groupId>
            <artifactId>spring-cloud-alibaba-dependencies</artifactId>
            <version>${spring-cloud-alibaba.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
<!-- The @...@ placeholders in bootstrap.yml are only replaced when Maven resource filtering
     with the @ delimiter is enabled for src/main/resources. spring-boot-starter-parent
     configures this; importing the BOM alone does not. -->
<profiles>
    <profile>
        <id>development</id>
        <!-- Default profile, so @activatedProperties@ still resolves when no -P flag is given -->
        <activation>
            <activeByDefault>true</activeByDefault>
        </activation>
        <properties>
            <activatedProperties>dev</activatedProperties>
        </properties>
    </profile>
    <profile>
        <id>testing</id>
        <properties>
            <activatedProperties>test</activatedProperties>
        </properties>
    </profile>
    <profile>
        <id>staging</id>
        <properties>
            <activatedProperties>staging</activatedProperties>
        </properties>
    </profile>
    <profile>
        <id>production</id>
        <properties>
            <activatedProperties>prod</activatedProperties>
        </properties>
    </profile>
</profiles>
bootstrap.yml (shared by all microservices):
yaml
# Spring Cloud 2021.x only processes bootstrap.yml when spring-cloud-starter-bootstrap
# is on the classpath (or spring.cloud.bootstrap.enabled=true is set).
spring:
  application:
    # Quoted because a plain YAML scalar may not start with "@"
    name: '@project.artifactId@'
  profiles:
    active: '@activatedProperties@'
  cloud:
    nacos:
      config:
        server-addr: ${NACOS_SERVER:localhost:8848}
        namespace: ${NACOS_NAMESPACE:public}
        group: ${NACOS_GROUP:DEFAULT_GROUP}
        file-extension: yaml
        shared-configs:
          - data-id: common-${spring.profiles.active}.yaml
            refresh: true
      discovery:
        server-addr: ${NACOS_SERVER:localhost:8848}
        namespace: ${NACOS_NAMESPACE:public}
        group: ${NACOS_GROUP:DEFAULT_GROUP}
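To see which Nacos data IDs a service will request for a given profile, the naming rules can be sketched as below. This is an illustration under assumptions: the helper `nacos_data_ids` is hypothetical, and the listed order (shared config first, profile-specific application config last, later entries overriding earlier ones) reflects my understanding of Spring Cloud Alibaba's default loading order rather than anything stated in this document.

```shell
#!/bin/sh
# Print the Nacos data IDs that the bootstrap.yml above would resolve,
# from lowest to highest precedence (assumed order).
# Usage: nacos_data_ids APP_NAME PROFILE [FILE_EXTENSION]
nacos_data_ids() {
  app="$1"
  profile="$2"
  ext="${3:-yaml}"
  echo "common-${profile}.yaml"      # shared-configs entry from bootstrap.yml
  echo "${app}"                      # application name only
  echo "${app}.${ext}"               # application name + file extension
  echo "${app}-${profile}.${ext}"    # application name + active profile
}
```

For example, `nacos_data_ids user-service dev` lists `common-dev.yaml` first, whereas a profile named `development` would look for `common-development.yaml`, so the active profile name and the data IDs published in Nacos have to agree.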
Nacos configuration per environment:
common-dev.yaml (development):
yaml
# Development environment configuration
spring:
  cloud:
    sentinel:
      transport:
        dashboard: localhost:8080
      eager: true
seata:
  enabled: false
logging:
  level:
    com.alibaba.nacos: WARN
    root: INFO
common-prod.yaml (production):
yaml
# Production environment configuration
spring:
  cloud:
    sentinel:
      transport:
        dashboard: sentinel-dashboard:8080
        port: 8719
      eager: true
      datasource:
        ds1:
          nacos:
            server-addr: ${spring.cloud.nacos.config.server-addr}
            dataId: ${spring.application.name}-sentinel
            groupId: SENTINEL_GROUP
            rule-type: flow
seata:
  enabled: true
  tx-service-group: ${spring.application.name}-tx-group
  service:
    vgroup-mapping:
      ${spring.application.name}-tx-group: default
    disable-global-transaction: false
  config:
    type: nacos
    nacos:
      server-addr: ${spring.cloud.nacos.config.server-addr}
      namespace: ${spring.cloud.nacos.config.namespace}
      group: SEATA_GROUP
  registry:
    type: nacos
    nacos:
      server-addr: ${spring.cloud.nacos.config.server-addr}
      namespace: ${spring.cloud.nacos.config.namespace}
      group: SEATA_GROUP
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  metrics:
    export:
      prometheus:
        enabled: true
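2. Dockerfile Configuration
Generic Dockerfile:
dockerfile
# Build stage
FROM maven:3.8.6-openjdk-11-slim AS builder
# Maven profile to build with; override with --build-arg BUILD_PROFILE=...
ARG BUILD_PROFILE=development
WORKDIR /app
# Copy the POM first so the dependency layer can be cached
COPY pom.xml .
# Download dependencies
RUN mvn dependency:go-offline -B
# Copy the source code
COPY src ./src
# Build the application
RUN mvn clean package -DskipTests -P${BUILD_PROFILE}
# Runtime stage
FROM openjdk:11-jre-slim
LABEL maintainer="devops@company.com"
# Default port; override at runtime with -e SERVER_PORT=...
ENV SERVER_PORT=8080
# Install curl for the health check (the slim image does not ship it) and create the application user
RUN apt-get update && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/* \
    && groupadd -r spring && useradd -r -g spring spring
WORKDIR /app
# Copy the JAR file
COPY --from=builder /app/target/*.jar app.jar
USER spring:spring
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=60s --retries=3 \
    CMD curl -f http://localhost:${SERVER_PORT}/actuator/health || exit 1
# Expose the port
EXPOSE ${SERVER_PORT}
# Start the application. The exec form does not expand ${...}, so java is launched through a shell.
# Spring Boot reads SPRING_PROFILES_ACTIVE and SERVER_PORT directly from the environment.
ENTRYPOINT ["sh", "-c", "exec java $JAVA_OPTS -Djava.security.egd=file:/dev/./urandom -jar app.jar"]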
Dockerfile-nacos (only if a custom Nacos image is needed):
dockerfile
FROM nacos/nacos-server:v2.2.3
# Copy custom configuration
COPY conf/custom.properties /home/nacos/conf/
COPY scripts/startup.sh /home/nacos/bin/
# Make the startup script executable
RUN chmod +x /home/nacos/bin/startup.sh
EXPOSE 8848 9848 9849
III. GitLab Runner Configuration
1. Registering a Docker Runner
bash
# Install GitLab Runner
curl -L "https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh" | sudo bash
sudo apt-get install -y gitlab-runner
# Register a Docker runner
# (newer GitLab releases deprecate registration tokens in favor of runner authentication tokens)
sudo gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.example.com" \
  --registration-token "PROJECT_REGISTRATION_TOKEN" \
  --description "docker-runner" \
  --executor "docker" \
  --docker-image "docker:20.10.16" \
  --docker-volumes "/var/run/docker.sock:/var/run/docker.sock" \
  --docker-volumes "/cache" \
  --docker-privileged \
  --tag-list "docker,spring-cloud" \
  --run-untagged="false"
# Edit the runner configuration
sudo vim /etc/gitlab-runner/config.toml
config.toml:
toml
concurrent = 4
check_interval = 0
[session_server]
session_timeout = 1800
[[runners]]
name = "docker-runner"
url = "https://gitlab.example.com"
token = "TOKEN"
executor = "docker"
[runners.custom_build_dir]
[runners.cache]
[runners.cache.s3]
[runners.cache.gcs]
[runners.cache.azure]
[runners.docker]
tls_verify = false
image = "docker:20.10.16"
privileged = true
disable_entrypoint_overwrite = false
oom_kill_disable = false
disable_cache = false
volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache", "/builds:/builds"]
shm_size = 0
pull_policy = "if-not-present"
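2. Configuring the GitLab Container Registry
When the Container Registry is enabled for the project, GitLab predefines the following variables for every job, so they do not have to be created under Settings > CI/CD > Variables:
- CI_REGISTRY: registry.gitlab.com
- CI_REGISTRY_IMAGE: registry.gitlab.com/your-group/your-project
- CI_REGISTRY_USER: gitlab-ci-token
- CI_REGISTRY_PASSWORD: $CI_JOB_TOKEN (generated automatically for each job)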
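IV. GitLab CI/CD Configuration
.gitlab-ci.yml:
yaml
# Global variables
variables:
  DOCKER_DRIVER: overlay2
  MAVEN_OPTS: "-Dmaven.repo.local=$CI_PROJECT_DIR/.m2/repository"
  # DOCKER_HOST is intentionally not set: the runner registered above mounts /var/run/docker.sock,
  # so jobs talk to the host Docker daemon and images built in one job remain available to later
  # jobs on the same runner host. Pointing DOCKER_HOST at tcp://docker:2375 would also fail with
  # docker:dind 20.10, which serves TLS on port 2376 unless DOCKER_TLS_CERTDIR is set to "".
  # Image tagging strategy
  IMAGE_TAG_COMMIT: $CI_COMMIT_SHORT_SHA
  IMAGE_TAG_BRANCH: $CI_COMMIT_REF_SLUG
  IMAGE_TAG_LATEST: latest
# Cache configuration
cache:
  key: "${CI_COMMIT_REF_SLUG}"
  paths:
    - .m2/repository/
    - target/
# Stage definitions
stages:
  - build
  - test
  - sonarqube-check
  - build-image
  - push-image
  - deploy-development
  - deploy-testing
  - deploy-staging
  - deploy-production
  - notify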
# 1. Build stage
build-services:
  stage: build
  image: maven:3.8.6-openjdk-11-slim
  script:
    - echo "Building all Spring Cloud services..."
    # "package" rather than "compile", so the JAR artifacts listed below actually exist
    - mvn clean package -DskipTests -U -B
  artifacts:
    paths:
      - "**/target/*.jar"
    expire_in: 1 hour
  only:
    - branches
  tags:
    - docker
# 2. Unit tests
unit-test:
  stage: test
  image: maven:3.8.6-openjdk-11-slim
  script:
    - echo "Running unit tests..."
    - mvn test -B
  artifacts:
    reports:
      junit:
        - "**/target/surefire-reports/TEST-*.xml"
        - "**/target/failsafe-reports/TEST-*.xml"
  only:
    - merge_requests
    - develop
    - main
  tags:
    - docker
# 3. SonarQube code quality check
sonarqube-check:
  stage: sonarqube-check
  image: maven:3.8.6-openjdk-11-slim
  variables:
    SONAR_USER_HOME: "${CI_PROJECT_DIR}/.sonar"
    GIT_DEPTH: "0"
  cache:
    key: "${CI_JOB_NAME}"
    paths:
      - .sonar/cache
  script:
    - mvn verify sonar:sonar -Dsonar.projectKey=$SONAR_PROJECT_KEY -Dsonar.host.url=$SONAR_HOST_URL -Dsonar.login=$SONAR_TOKEN
  only:
    - develop
    - main
    - /^release\/.*$/
  tags:
    - docker
# 4. Build Docker images
build-docker-images:
  stage: build-image
  image: docker:20.10.16
  services:
    - docker:20.10.16-dind
  variables:
    BUILD_PROFILE: "development"
  script:
    # Quoted as a whole: a plain YAML scalar must not contain ": "
    - 'echo "Building Docker images for commit: $CI_COMMIT_SHORT_SHA"'
    # Log in to the GitLab Container Registry (stdin keeps the password out of the process list)
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    # Iterate over all service directories and build an image for each
    - |
      for service_dir in */ ; do
        if [ -f "${service_dir}pom.xml" ]; then
          service_name=$(basename "$service_dir")
          echo "Building Docker image for $service_name..."
          docker build \
            --build-arg BUILD_PROFILE="${BUILD_PROFILE}" \
            -t "$CI_REGISTRY_IMAGE/${service_name}:${IMAGE_TAG_COMMIT}" \
            -t "$CI_REGISTRY_IMAGE/${service_name}:${IMAGE_TAG_BRANCH}" \
            -f "${service_dir}Dockerfile" \
            "$service_dir"
        fi
      done
  artifacts:
    paths:
      - "**/target/*.jar"
  only:
    - branches
  tags:
    - docker
# 5. Push Docker images
push-docker-images:
  stage: push-image
  image: docker:20.10.16
  services:
    - docker:20.10.16-dind
  dependencies:
    - build-docker-images
  script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    # Push the images of all services
    - |
      for service_dir in */ ; do
        if [ -f "${service_dir}pom.xml" ]; then
          service_name=$(basename "$service_dir")
          echo "Pushing Docker image for $service_name..."
          docker push "$CI_REGISTRY_IMAGE/${service_name}:${IMAGE_TAG_COMMIT}"
          docker push "$CI_REGISTRY_IMAGE/${service_name}:${IMAGE_TAG_BRANCH}"
          # On main/master, also push the latest tag
          # (POSIX test syntax, because the docker image only ships BusyBox sh)
          if [ "$CI_COMMIT_REF_NAME" = "main" ] || [ "$CI_COMMIT_REF_NAME" = "master" ]; then
            docker tag "$CI_REGISTRY_IMAGE/${service_name}:${IMAGE_TAG_COMMIT}" "$CI_REGISTRY_IMAGE/${service_name}:${IMAGE_TAG_LATEST}"
            docker push "$CI_REGISTRY_IMAGE/${service_name}:${IMAGE_TAG_LATEST}"
          fi
        fi
      done
  only:
    - branches
  tags:
    - docker
# 6. Deploy to development
deploy-development:
  stage: deploy-development
  image: alpine:latest
  variables:
    ENVIRONMENT: "development"
    DEPLOY_SERVER: "dev-server.example.com"
    DOCKER_REGISTRY: "$CI_REGISTRY"
  before_script:
    - apk add --no-cache openssh-client docker-cli curl
    - mkdir -p ~/.ssh
    - echo "$SSH_PRIVATE_KEY_DEV" > ~/.ssh/id_rsa
    - chmod 600 ~/.ssh/id_rsa
    # Disables host key verification; pinning the host key in ~/.ssh/known_hosts is safer
    - echo -e "Host *\n\tStrictHostKeyChecking no\n\n" > ~/.ssh/config
  script:
    - |
      echo "Deploying to development environment..."
      # SSH to the deployment server and run the deployment script
      ssh -o ConnectTimeout=10 deploy@$DEPLOY_SERVER "
        # Switch to the deployment directory
        cd /data/deploy/development
        # Update the deployment scripts
        git pull origin main
        # Set environment variables
        export ENVIRONMENT=development
        export DEPLOY_TAG=${IMAGE_TAG_BRANCH}
        export DOCKER_REGISTRY=${DOCKER_REGISTRY}
        export DOCKER_REGISTRY_USER=${CI_REGISTRY_USER}
        export DOCKER_REGISTRY_PASSWORD=${CI_REGISTRY_PASSWORD}
        # Run the deployment (deploy.sh expects space-separated option values)
        ./deploy.sh --services all --env development --tag ${IMAGE_TAG_BRANCH}
      "
      # Wait for the deployment to settle, then check service health
      sleep 30
      ssh deploy@$DEPLOY_SERVER "cd /data/deploy/development && ./health-check.sh"
  environment:
    name: development
    url: https://dev.example.com
  only:
    - /^feature\/.*$/
    - develop
  tags:
    - docker
# 7. Deploy to testing
deploy-testing:
  stage: deploy-testing
  image: alpine:latest
  variables:
    ENVIRONMENT: "testing"
    DEPLOY_SERVER: "test-server.example.com"
  before_script:
    - apk add --no-cache openssh-client docker-cli
    - mkdir -p ~/.ssh
    - echo "$SSH_PRIVATE_KEY_TEST" > ~/.ssh/id_rsa
    - chmod 600 ~/.ssh/id_rsa
    # Non-interactive SSH fails on unknown hosts; pinning the host key in known_hosts is the safer alternative
    - echo -e "Host *\n\tStrictHostKeyChecking no\n\n" > ~/.ssh/config
    # Pushing requires an authenticated registry session
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
  script:
    - |
      echo "Deploying to testing environment..."
      # Rebuild the images with the testing profile
      for service_dir in */ ; do
        if [ -f "${service_dir}pom.xml" ]; then
          service_name=$(basename "$service_dir")
          docker build \
            --build-arg BUILD_PROFILE=testing \
            -t "$CI_REGISTRY_IMAGE/${service_name}:testing-${IMAGE_TAG_COMMIT}" \
            -f "${service_dir}Dockerfile" \
            "$service_dir"
          docker push "$CI_REGISTRY_IMAGE/${service_name}:testing-${IMAGE_TAG_COMMIT}"
        fi
      done
      # Deploy to the testing environment (deploy.sh expects space-separated option values)
      ssh deploy@$DEPLOY_SERVER "
        cd /data/deploy/testing
        export ENVIRONMENT=testing
        ./deploy.sh --services all --env testing --tag testing-${IMAGE_TAG_COMMIT}
      "
  environment:
    name: testing
    url: https://test.example.com
  only:
    - develop
  when: manual
  tags:
    - docker
# 8. Deploy to staging
deploy-staging:
  stage: deploy-staging
  image: alpine:latest
  variables:
    ENVIRONMENT: "staging"
  before_script:
    - apk add --no-cache openssh-client docker-cli
    - mkdir -p ~/.ssh
    - echo "$SSH_PRIVATE_KEY_STAGING" > ~/.ssh/id_rsa
    - chmod 600 ~/.ssh/id_rsa
    # Non-interactive SSH fails on unknown hosts; pinning the host key in known_hosts is the safer alternative
    - echo -e "Host *\n\tStrictHostKeyChecking no\n\n" > ~/.ssh/config
    # Pushing requires an authenticated registry session
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
  script:
    - |
      echo "Deploying to staging environment..."
      # Build the staging images
      for service_dir in */ ; do
        if [ -f "${service_dir}pom.xml" ]; then
          service_name=$(basename "$service_dir")
          docker build \
            --build-arg BUILD_PROFILE=staging \
            -t "$CI_REGISTRY_IMAGE/${service_name}:staging-${IMAGE_TAG_COMMIT}" \
            -f "${service_dir}Dockerfile" \
            "$service_dir"
          docker push "$CI_REGISTRY_IMAGE/${service_name}:staging-${IMAGE_TAG_COMMIT}"
        fi
      done
      # Run the blue-green deployment
      ssh deploy@staging-server "
        cd /data/deploy/staging
        ./blue-green-deploy.sh --env=staging --tag=staging-${IMAGE_TAG_COMMIT}
      "
  environment:
    name: staging
    url: https://staging.example.com
  only:
    - /^release\/.*$/
  when: manual
  tags:
    - docker
# 9. Deploy to production
deploy-production:
  stage: deploy-production
  image: alpine:latest
  variables:
    ENVIRONMENT: "production"
  before_script:
    # curl is needed for the final health check
    - apk add --no-cache openssh-client docker-cli curl
    - mkdir -p ~/.ssh
    - echo "$SSH_PRIVATE_KEY_PROD" > ~/.ssh/id_rsa
    - chmod 600 ~/.ssh/id_rsa
    # Non-interactive SSH fails on unknown hosts; for production, prefer pinning host keys in known_hosts
    - echo -e "Host *\n\tStrictHostKeyChecking no\n\n" > ~/.ssh/config
    # Pulling from the private registry requires an authenticated session
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
  script:
    - |
      echo "Starting production deployment..."
      # Verify that every image exists in the registry
      for service_dir in */ ; do
        if [ -f "${service_dir}pom.xml" ]; then
          service_name=$(basename "$service_dir")
          echo "Verifying image: $CI_REGISTRY_IMAGE/${service_name}:${IMAGE_TAG_COMMIT}"
          docker pull "$CI_REGISTRY_IMAGE/${service_name}:${IMAGE_TAG_COMMIT}"
        fi
      done
      # Rolling update, starting with the first production server
      ssh deploy@prod-server-1 "
        cd /data/deploy/production
        ./rolling-update.sh --env=production --tag=${IMAGE_TAG_COMMIT}
      "
      # Wait for the first server to finish
      sleep 60
      # Deploy to the remaining production servers
      for server in prod-server-2 prod-server-3; do
        ssh deploy@$server "
          cd /data/deploy/production
          ./rolling-update.sh --env=production --tag=${IMAGE_TAG_COMMIT}
        "
        sleep 30
      done
      # Final health check
      echo "Performing final health check..."
      curl -f https://api.example.com/actuator/health || exit 1
  environment:
    name: production
    url: https://api.example.com
  only:
    - main
  when: manual
  tags:
    - docker
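# 10. Deployment notification
deploy-notification:
  stage: notify
  image: alpine:latest
  before_script:
    - apk add --no-cache curl
  script:
    - |
      # Send a success notification to DingTalk / WeCom.
      # "when: on_success" already limits this job to successful pipelines; CI_JOB_STATUS is only
      # meaningful in after_script, so it is not checked here.
      # ENVIRONMENT is a job-level variable of the deploy jobs and is empty here unless defined globally.
      curl -X POST "$NOTIFICATION_WEBHOOK" \
        -H 'Content-Type: application/json' \
        -d "{
          \"msgtype\": \"markdown\",
          \"markdown\": {
            \"title\": \"Deployment succeeded\",
            \"text\": \"### 🚀 Deployment succeeded\\n**Project**: $CI_PROJECT_NAME\\n**Environment**: $ENVIRONMENT\\n**Branch**: $CI_COMMIT_REF_NAME\\n**Commit**: $CI_COMMIT_TITLE\\n**Author**: $GITLAB_USER_NAME\\n**Time**: $(date)\"
          }
        }"
  when: on_success
  tags:
    - docker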
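The tagging scheme used by the pipeline (short commit SHA, branch slug, and `latest` only on main/master) can be reproduced locally with a sketch like the one below. `ref_slug` approximates how GitLab derives `CI_COMMIT_REF_SLUG` (lowercase, everything except `a-z` and `0-9` replaced by `-`, at most 63 characters, no leading or trailing `-`), and `image_tags` takes the first 8 characters of the SHA like `CI_COMMIT_SHORT_SHA`. Both function names are hypothetical.

```shell
#!/bin/sh
# Approximate GitLab's CI_COMMIT_REF_SLUG for a branch name.
ref_slug() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]/-/g' \
    | cut -c1-63 \
    | sed -e 's/^-*//' -e 's/-*$//'
}

# Print the image tags the pipeline would push, one per line.
# Usage: image_tags BRANCH FULL_COMMIT_SHA
image_tags() {
  branch="$1"
  sha="$2"
  printf '%s\n' "$sha" | cut -c1-8     # commit tag (short SHA)
  ref_slug "$branch"                   # branch tag
  case "$branch" in
    main|master) echo "latest" ;;      # latest only on the main branch
  esac
}
```

Running `image_tags feature/Add_Login "$(git rev-parse HEAD)"` shows which tags to expect in the registry for a feature branch.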
V. Deployment Scripts and Configuration Files
1. Deployment Directory Layout
text
/data/deploy/
├── development/
│ ├── docker-compose.yml
│ ├── deploy.sh
│ ├── health-check.sh
│ └── .env.development
├── testing/
├── staging/
├── production/
└── scripts/
├── blue-green-deploy.sh
├── rolling-update.sh
└── rollback.sh
2. Docker Compose Configuration
docker-compose.yml (development example):
yaml
version: '3.8'
services:
  # Nacos
  nacos:
    image: nacos/nacos-server:v2.2.3
    container_name: nacos-dev
    environment:
      - MODE=standalone
      - SPRING_PROFILES_ACTIVE=dev
      - PREFER_HOST_MODE=hostname
    ports:
      - "8848:8848"
      - "9848:9848"
      - "9849:9849"
    volumes:
      - ./data/nacos/logs:/home/nacos/logs
      - ./data/nacos/conf:/home/nacos/conf
    networks:
      - spring-cloud-net
    healthcheck:
      # The service list API requires the pageNo and pageSize parameters
      test: ["CMD", "curl", "-f", "http://localhost:8848/nacos/v1/ns/service/list?pageNo=1&pageSize=1"]
      interval: 30s
      timeout: 10s
      retries: 3
  # Sentinel Dashboard
  sentinel-dashboard:
    image: bladex/sentinel-dashboard:1.8.6
    container_name: sentinel-dashboard-dev
    environment:
      - SERVER_PORT=8080
      - AUTH_USERNAME=sentinel
      - AUTH_PASSWORD=sentinel
    ports:
      # Host port 8858 avoids a clash with api-gateway, which also publishes host port 8080
      - "8858:8080"
    networks:
      - spring-cloud-net
  # User service
  user-service:
    image: ${REGISTRY}/user-service:${TAG}
    container_name: user-service-dev
    environment:
      - SPRING_PROFILES_ACTIVE=${ENVIRONMENT}
      - NACOS_SERVER=nacos:8848
      - NACOS_NAMESPACE=${NACOS_NAMESPACE}
      - SERVER_PORT=8081
    ports:
      - "8081:8081"
    volumes:
      - ./logs/user-service:/app/logs
    networks:
      - spring-cloud-net
    depends_on:
      nacos:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8081/actuator/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    deploy:
      replicas: 1
  # Order service
  order-service:
    image: ${REGISTRY}/order-service:${TAG}
    container_name: order-service-dev
    environment:
      - SPRING_PROFILES_ACTIVE=${ENVIRONMENT}
      - NACOS_SERVER=nacos:8848
      - NACOS_NAMESPACE=${NACOS_NAMESPACE}
      - SERVER_PORT=8082
    ports:
      - "8082:8082"
    networks:
      - spring-cloud-net
    depends_on:
      - nacos
      - user-service
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8082/actuator/health"]
      interval: 30s
      timeout: 10s
      retries: 3
  # API gateway
  api-gateway:
    image: ${REGISTRY}/api-gateway:${TAG}
    container_name: api-gateway-dev
    environment:
      - SPRING_PROFILES_ACTIVE=${ENVIRONMENT}
      - NACOS_SERVER=nacos:8848
      - NACOS_NAMESPACE=${NACOS_NAMESPACE}
      - SERVER_PORT=8080
    ports:
      - "8080:8080"
      - "9999:9999"
    networks:
      - spring-cloud-net
    depends_on:
      - nacos
      - user-service
      - order-service
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/actuator/health"]
      interval: 30s
      timeout: 10s
      retries: 3
networks:
  spring-cloud-net:
    driver: bridge
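Because every service in this file publishes ports on the same host, a quick duplicate check before `docker-compose up` can catch collisions early. The helper below is a minimal sketch: `duplicate_host_ports` is a hypothetical name, and it only recognizes quoted `"HOST:CONTAINER"` mappings as written in this file (no IP prefixes or port ranges).

```shell
#!/bin/sh
# Print every host port that is published more than once in a compose file.
# Usage: duplicate_host_ports docker-compose.yml
duplicate_host_ports() {
  grep -Eo '"[0-9]+:[0-9]+"' "$1" \
    | tr -d '"' \
    | cut -d: -f1 \
    | sort \
    | uniq -d
}
```

An empty result means no host port is mapped twice; any printed port needs a different host-side mapping for one of the services.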
3. Deployment Scripts
deploy.sh (generic deployment):
bash
#!/bin/bash
# Deployment script
set -e
# Parse arguments; both "--key value" and "--key=value" are accepted
while [[ $# -gt 0 ]]; do
  case $1 in
    --services=*) SERVICES="${1#*=}"; shift ;;
    --env=*) ENVIRONMENT="${1#*=}"; shift ;;
    --tag=*) DEPLOY_TAG="${1#*=}"; shift ;;
    --services) SERVICES="$2"; shift 2 ;;
    --env) ENVIRONMENT="$2"; shift 2 ;;
    --tag) DEPLOY_TAG="$2"; shift 2 ;;
    *)
      echo "Unknown option: $1"
      exit 1
      ;;
  esac
done
# Defaults
ENVIRONMENT=${ENVIRONMENT:-development}
DEPLOY_TAG=${DEPLOY_TAG:-latest}
SERVICES=${SERVICES:-all}
echo "========================================"
echo "Deploying Spring Cloud Alibaba Services"
echo "Environment: $ENVIRONMENT"
echo "Tag: $DEPLOY_TAG"
echo "Services: $SERVICES"
echo "========================================"
# Load environment variables; "set -a" exports them so docker-compose can substitute ${REGISTRY}, ${TAG}, ...
if [ -f ".env.$ENVIRONMENT" ]; then
  set -a
  source ".env.$ENVIRONMENT"
  set +a
fi
# The requested tag overrides TAG from the env file
export TAG="$DEPLOY_TAG"
# Log in to the Docker registry
if [ -n "$DOCKER_REGISTRY_USER" ] && [ -n "$DOCKER_REGISTRY_PASSWORD" ]; then
  echo "Logging into Docker registry..."
  echo "$DOCKER_REGISTRY_PASSWORD" | docker login -u "$DOCKER_REGISTRY_USER" --password-stdin "$DOCKER_REGISTRY"
fi
# Stop and remove the old containers
echo "Stopping old containers..."
docker-compose down --remove-orphans
# Pull the new images
echo "Pulling new images..."
if [ "$SERVICES" = "all" ]; then
  docker-compose pull
else
  for service in ${SERVICES//,/ }; do
    docker-compose pull "$service"
  done
fi
# Start the services
echo "Starting services..."
if [ "$SERVICES" = "all" ]; then
  docker-compose up -d
else
  for service in ${SERVICES//,/ }; do
    docker-compose up -d "$service"
  done
fi
# Wait for the services to start
echo "Waiting for services to be healthy..."
sleep 30
# Show the service status
echo "Checking service status..."
docker-compose ps
echo "========================================"
echo "Deployment completed!"
echo "========================================"
health-check.sh (health check):
bash
#!/bin/bash
# Health check script
set -e
echo "Starting health check..."
# Services to check
services=("nacos" "user-service" "order-service" "api-gateway")
max_retries=30
retry_interval=10
for service in "${services[@]}"; do
  echo "Checking health of $service..."
  retries=0
  while [ $retries -lt $max_retries ]; do
    # The name filter is a substring match, so keep only the first hit
    container_id=$(docker ps -qf "name=${service}" | head -n 1)
    if [ -z "$container_id" ]; then
      echo "Container $service not found!"
      exit 1
    fi
    health_status=$(docker inspect --format='{{.State.Health.Status}}' "$container_id" 2>/dev/null || echo "unknown")
    if [ "$health_status" = "healthy" ]; then
      echo "$service is healthy!"
      break
    elif [ "$health_status" = "unhealthy" ]; then
      echo "$service is unhealthy!"
      docker logs "$container_id" --tail 50
      exit 1
    fi
    retries=$((retries + 1))
    echo "Retry $retries/$max_retries for $service..."
    sleep $retry_interval
  done
  if [ $retries -eq $max_retries ]; then
    echo "$service health check timeout!"
    exit 1
  fi
done
# Check service registration in Nacos (the list API requires pageNo and pageSize)
echo "Checking service registration in Nacos..."
service_list=$(curl -s "http://localhost:8848/nacos/v1/ns/service/list?pageNo=1&pageSize=100")
for service in user-service order-service; do
  if echo "$service_list" | grep -q "$service"; then
    echo "$service registered"
  else
    echo "$service not registered"
    exit 1
  fi
done
echo "========================================"
echo "All services are healthy!"
echo "========================================"
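The poll-until-healthy loop appears in several of these scripts and can be factored into one reusable helper. The following is a sketch, not part of the original scripts: `wait_until` is a hypothetical function that runs an arbitrary command up to a maximum number of attempts and sleeps between failures.

```shell
#!/bin/sh
# Retry a command until it succeeds or the attempt budget is exhausted.
# Usage: wait_until MAX_RETRIES INTERVAL_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt failed.
wait_until() {
  max_retries="$1"
  interval="$2"
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_retries" ]; do
    if "$@"; then
      return 0
    fi
    echo "Attempt $attempt/$max_retries failed: $*" >&2
    attempt=$((attempt + 1))
    # Do not sleep after the final attempt
    if [ "$attempt" -le "$max_retries" ]; then
      sleep "$interval"
    fi
  done
  return 1
}
```

For instance, `wait_until 30 10 curl -fsS http://localhost:8081/actuator/health` waits up to roughly five minutes for user-service and fails the calling script otherwise.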
rolling-update.sh (rolling update for production):
bash
#!/bin/bash
# Rolling update script
set -e
# Accept --env=... and --tag=... as passed by the CI job; environment variables are the fallback
for arg in "$@"; do
  case $arg in
    --env=*) ENVIRONMENT="${arg#*=}" ;;
    --tag=*) DEPLOY_TAG="${arg#*=}" ;;
  esac
done
ENVIRONMENT=${ENVIRONMENT:-production}
DEPLOY_TAG=${DEPLOY_TAG:-latest}
# Export the tag so docker-compose can substitute ${TAG} in the image names
export TAG="$DEPLOY_TAG"
echo "Starting rolling update for $ENVIRONMENT with tag $DEPLOY_TAG"
# Get all services
services=$(docker-compose config --services)
for service in $services; do
  echo "Updating $service..."
  # Pull the new image
  docker-compose pull "$service"
  # Replace the old container with a new one
  docker-compose up -d --no-deps "$service"
  # Wait until the service reports healthy
  echo "Waiting for $service to be healthy..."
  max_retries=30
  retries=0
  while [ $retries -lt $max_retries ]; do
    container_id=$(docker ps -qf "name=${service}" | head -n 1)
    health_status=$(docker inspect --format='{{.State.Health.Status}}' "$container_id" 2>/dev/null || echo "unknown")
    if [ "$health_status" = "healthy" ]; then
      echo "$service update successful!"
      break
    fi
    retries=$((retries + 1))
    sleep 10
  done
  if [ $retries -eq $max_retries ]; then
    echo "$service update failed!"
    exit 1
  fi
  # Clean up old images
  echo "Cleaning up old images for $service..."
  docker image prune -f
done
echo "Rolling update completed successfully!"
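VI. Environment Variable Configuration
.env.development:
bash
# Development environment configuration
# Note: docker-compose.yml passes ENVIRONMENT to the services as SPRING_PROFILES_ACTIVE, so with this
# value the shared Nacos data ID would resolve to common-development.yaml rather than common-dev.yaml.
ENVIRONMENT=development
# Must match CI_REGISTRY_IMAGE, the prefix under which the pipeline pushes the service images
REGISTRY=registry.gitlab.com/your-group/your-project
TAG=latest
NACOS_NAMESPACE=dev-namespace
# Registry credentials are exported by the CI job before deploy.sh runs. CI_JOB_TOKEN does not exist
# on the deployment server, so defining them here would overwrite them with an empty value.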
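.env.production:
bash
# Production environment configuration
ENVIRONMENT=production
# Must match CI_REGISTRY_IMAGE, the prefix under which the pipeline pushes the service images
REGISTRY=registry.gitlab.com/your-group/your-project
TAG=latest
NACOS_NAMESPACE=prod-namespace
NACOS_USERNAME=nacos
NACOS_PASSWORD=${NACOS_PROD_PASSWORD}
SENTINEL_DASHBOARD=sentinel-dashboard:8080
SEATA_SERVER=seata-server:8091
# JVM tuning (quoted, because the file is sourced by the shell and the value contains spaces)
JAVA_OPTS="-Xmx2g -Xms2g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:+ParallelRefProcEnabled -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/app/logs"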
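Since a missing or empty entry in these files only surfaces later as a broken image name or a failed Nacos connection, it can help to validate the file before deploying. The sketch below assumes a hypothetical helper `check_env_file`; the list of required keys is up to the caller.

```shell
#!/bin/sh
# Verify that an env file defines a non-empty value for each required key.
# Usage: check_env_file FILE KEY...
# Prints each missing key to stderr and returns 1 if any is missing or empty.
check_env_file() {
  file="$1"
  shift
  missing=0
  for key in "$@"; do
    if ! grep -Eq "^${key}=.+" "$file"; then
      echo "Missing or empty: $key" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A deployment script could run `check_env_file .env.production ENVIRONMENT REGISTRY TAG NACOS_NAMESPACE || exit 1` as its first step.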
VII. Monitoring and Logging Configuration
1. Docker Compose Monitoring Extension
yaml
# Monitoring services (merge into docker-compose.yml, or keep as a separate file passed with -f)
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--storage.tsdb.retention.time=200h'
      - '--web.enable-lifecycle'
    ports:
      - "9090:9090"
    networks:
      - spring-cloud-net
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning
    ports:
      - "3000:3000"
    networks:
      - spring-cloud-net
    depends_on:
      - prometheus
  loki:
    image: grafana/loki:latest
    container_name: loki
    ports:
      - "3100:3100"
    command: -config.file=/etc/loki/local-config.yaml
    volumes:
      - ./loki/loki-config.yaml:/etc/loki/local-config.yaml
      - loki_data:/loki
    networks:
      - spring-cloud-net
  promtail:
    image: grafana/promtail:latest
    container_name: promtail
    volumes:
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - ./promtail/promtail-config.yaml:/etc/promtail/config.yaml
    command: -config.file=/etc/promtail/config.yaml
    networks:
      - spring-cloud-net
# Named volumes referenced above must be declared at the top level
volumes:
  prometheus_data:
  grafana_data:
  loki_data:
2. Log Collection Configuration
yaml
# Spring Boot logging configuration
logging:
  file:
    # When both are set, name takes precedence over path
    path: /app/logs
    name: /app/logs/app.log
  logback:
    rollingpolicy:
      max-file-size: 10MB
      max-history: 30
  pattern:
    console: "%d{yyyy-MM-dd HH:mm:ss} [%thread] %-5level %logger{36} - %msg%n"
    file: "%d{yyyy-MM-dd HH:mm:ss} [%thread] %-5level %logger{36} - %msg%n"
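VIII. Security and Best Practices
1. Docker Security Configuration
bash
# Run containers as a non-root user (Dockerfile instructions in Alpine syntax;
# Debian-based images such as openjdk:11-jre-slim use groupadd/useradd instead)
RUN addgroup -g 1000 spring && \
adduser -u 1000 -G spring -D spring
USER spring
# Enable Docker Content Trust
export DOCKER_CONTENT_TRUST=1
# Scan the image for vulnerabilities (docker scan has been retired in favor of Docker Scout)
docker scout cves $IMAGE_NAME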
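2. GitLab CI/CD Security
- Use Protected Branches to protect the main and develop branches
- Set up Merge Request Approval Rules
- Use protected environments to restrict who can deploy
- Store sensitive values in GitLab CI/CD Variables
- Rotate SSH keys and access tokens regularly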
3. Database Migration Script
bash
#!/bin/bash
# Database migration script
flyway migrate -configFiles=flyway.conf -locations=filesystem:sql/migrations
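IX. Troubleshooting and Recovery
1. Fast Rollback
bash
# Roll back to the previous version
./rollback.sh --env=production --tag=previous-tag
# View container logs
docker logs -f container_name --tail 100
# Open a shell in the container for debugging
docker exec -it container_name /bin/sh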
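The contents of rollback.sh are not shown here, so the following is only a sketch of one way to determine the "previous tag": each successful deployment appends its tag to a history file, and a rollback reads the second-to-last entry. The file name `release-history.log` and the functions `record_release` and `previous_release` are assumptions made for illustration.

```shell
#!/bin/sh
# History file with one deployed tag per line, newest last (assumed location).
HISTORY_FILE="${HISTORY_FILE:-./release-history.log}"

# Append a tag after a successful deployment.
record_release() {
  printf '%s\n' "$1" >> "$HISTORY_FILE"
}

# Print the tag deployed before the current one.
# Returns 1 when fewer than two releases have been recorded.
previous_release() {
  [ -f "$HISTORY_FILE" ] || return 1
  count=$(( $(wc -l < "$HISTORY_FILE") ))
  [ "$count" -ge 2 ] || return 1
  tail -n 2 "$HISTORY_FILE" | head -n 1
}
```

With this in place, a rollback could be invoked as `./rollback.sh --env=production --tag="$(previous_release)"`, provided the deployment scripts call `record_release` after each successful run.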
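2. Key Metrics to Monitor
- Container CPU and memory usage
- Application response time
- Error rate
- Number of service instances registered in Nacos
- Sentinel flow-control activity
- Database connection pool usage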
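This Spring Cloud Alibaba containerized deployment setup provides:
- ✅ A complete CI/CD pipeline
- ✅ Isolated deployments for multiple environments
- ✅ Support for a microservice architecture
- ✅ Containerized deployment and orchestration
- ✅ Monitoring and log collection
- ✅ Security best practices
- ✅ Rolling updates and rollback
- ✅ Health checks and automatic recovery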