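Chapter 14: Production Deployment and DevOps Practices
Preface
Hi everyone, I'm 鲫小鱼, a front-end engineer who doesn't write front-end code. I enjoy sharing knowledge from outside the front-end world and helping page-slicing developers break out of that niche. Feel free to follow my WeChat official account, 《鲫小鱼不正经》. Likes, bookmarks, and follows are all welcome!
🎯 Learning Objectives for This Chapter
- Master the complete workflow for deploying AI applications to production: end-to-end practice from development to release
- Understand in depth how containerization (Docker) is applied to AI applications and how to optimize it
- Build a complete CI/CD pipeline: automated testing, building, deployment, and rollback
- Establish comprehensive monitoring: performance monitoring, log management, alerting, and troubleshooting
- Master cloud-native deployment strategies: Kubernetes, service mesh, load balancing, and elastic scaling
- Implement a security protection system: API security, data encryption, access control, and compliance management
- Complete a full enterprise-grade AI application deployment: the whole path from zero to production
📖 Theoretical Foundations: Core Concepts of Production Deployment (about 30%)
14.1 Why Deploying AI Applications Is So Complex
Compared with traditional web applications, AI applications face some distinctive challenges in production:
🔍 Resource-Intensive
- Model files are usually large (from a few GB to tens of GB), so they need efficient storage and transfer strategies
- Inference consumes substantial CPU/GPU resources and places high demands on hardware
- Memory consumption is high, so careful memory management and optimization are required
⚡ Latency-Sensitive
- Users have high expectations for AI response time, often down to the millisecond level
- Models take a long time to load, so warm-up and caching strategies are needed
- Network latency has a noticeable impact on user experience
🔄 Complex Version Management
- Model versions, code versions, and data versions have to be managed in a coordinated way
- A/B testing and gradual rollout strategies become more complex
- Rollback mechanisms have to take model compatibility into account
📊 Special Monitoring Requirements
- Model performance metrics (accuracy, recall, and so on) need to be monitored
- Token usage and API call costs need to be monitored in real time
- Model drift and performance degradation need to be detected promptly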
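The caching strategy mentioned under latency can be sketched as a small in-memory cache with a time-to-live. This is only an illustration under stated assumptions: the `TtlCache` class and its method names are hypothetical and do not come from the chapter, and a real deployment would more likely use the Redis layer that appears later.

```typescript
// Hypothetical in-memory TTL cache for AI responses (illustration only).
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  // Injectable clock so expiry can be exercised without waiting.
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // Expired: evict lazily on read.
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Return the cached value, or compute, store, and return it.
  async getOrCompute(key: string, compute: () => Promise<V>): Promise<V> {
    const cached = this.get(key);
    if (cached !== undefined) return cached;
    const value = await compute();
    this.set(key, value);
    return value;
  }
}
```

A caller could wrap the model call in `getOrCompute`, keyed by a normalized prompt, so that repeated questions within the TTL skip the model entirely.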
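14.2 The Evolution of Modern Deployment Architectures
Traditional monolithic deployment → microservices architecture → cloud-native architecture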
scss
┌─────────────────────────────────────────────────────────────┐
│          Cloud-Native AI Application Architecture           │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │Frontend App │  │ API Gateway │  │Load Balancer│          │
│  │  (Next.js)  │  │   (Kong)    │  │   (Nginx)   │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │ AI Service  │  │  Vector DB  │  │    Cache    │          │
│  │ (LangChain) │  │ (Pinecone)  │  │   (Redis)   │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │ Monitoring  │  │   Logging   │  │ Config Mgmt │          │
│  │(Prometheus) │  │    (ELK)    │  │  (Consul)   │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
└─────────────────────────────────────────────────────────────┘
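14.3 DevOps Practices Specific to AI Applications
🔄 MLOps vs DevOps
- MLOps extends DevOps into the machine learning domain
- It adds stages such as data pipelines, model training, and model deployment
- It has to handle special needs such as model versioning, experiment tracking, and model monitoring
📈 Continuous Integration / Continuous Deployment (CI/CD)
- Code quality checks: static analysis, unit tests, integration tests
- Model validation: accuracy tests, performance benchmarks, A/B tests
- Automated deployment: blue-green deployment, canary releases, rolling updates
🔍 Monitoring and Observability
- Application performance monitoring (APM): response time, throughput, error rate
- Infrastructure monitoring: CPU, memory, disk, and network utilization
- Business metrics monitoring: user satisfaction, conversion rate, cost-effectiveness
14.4 Deployment Strategies and Patterns
🚀 Deployment Strategies
- Blue-green deployment: maintain two identical production environments and switch traffic between them for zero-downtime releases
- Canary release: shift traffic from the old version to the new one gradually, starting with a small share
- Rolling update: replace instances in batches while keeping the service available
- A/B testing: run multiple versions at the same time and compare their results
🏗️ Architecture Patterns
- Microservices architecture: split the AI application into multiple independent services
- Service mesh: manage inter-service communication, security, and monitoring in one place
- Event-driven architecture: use message queues for asynchronous processing
- CQRS pattern: separate reads from writes to optimize query performance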
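The canary release strategy listed in this section is often implemented by assigning each user to a stable bucket and sending only the lowest buckets to the new version. The sketch below is one way to do that at the application level. The function names are hypothetical, and in practice the split is frequently done by the load balancer or service mesh instead.

```typescript
type Version = 'stable' | 'canary';

// FNV-1a 32-bit hash: deterministic, so a given user always lands in the same bucket.
function hashToBucket(userId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < userId.length; i++) {
    hash ^= userId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash % 100; // Bucket in the range 0..99.
}

// Route a user to the canary when their bucket falls below the rollout percentage.
function chooseVersion(userId: string, canaryPercent: number): Version {
  if (canaryPercent < 0 || canaryPercent > 100) {
    throw new RangeError('canaryPercent must be between 0 and 100');
  }
  return hashToBucket(userId) < canaryPercent ? 'canary' : 'stable';
}
```

Because the bucket depends only on the user ID, raising `canaryPercent` from 5 to 25 keeps every user who was already on the canary there, which avoids users flipping between versions during the rollout.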
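🐳 Containerization: Using Docker for AI Applications (about 20%)
14.5 Docker Basics and Best Practices
📦 Optimizing with Multi-Stage Builds
dockerfile
# File: Dockerfile
# Stage 1: build environment
FROM node:18-alpine AS builder
# Toolchain for compiling native Node.js addons during npm ci
RUN apk add --no-cache python3 make g++
WORKDIR /app
# Copy the package manifests and the TypeScript config
COPY package*.json tsconfig.json ./
# Install all dependencies; the build step needs devDependencies such as the TypeScript compiler
RUN npm ci
# Copy the source code
COPY src/ ./src/
# Build the application, then drop devDependencies so only runtime packages are copied onward
RUN npm run build && npm prune --omit=dev
# Stage 2: production image
FROM node:18-alpine AS production
# curl is used by the HEALTHCHECK below and is not part of the Alpine base image
RUN apk add --no-cache curl
WORKDIR /app
# Create a non-root user and group
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nextjs -u 1001 -G nodejs
# Copy the build output
COPY --from=builder --chown=nextjs:nodejs /app/dist ./dist
COPY --from=builder --chown=nextjs:nodejs /app/node_modules ./node_modules
COPY --from=builder --chown=nextjs:nodejs /app/package*.json ./
# Set environment variables
ENV NODE_ENV=production
ENV PORT=3000
# Switch to the non-root user
USER nextjs
# Expose the port
EXPOSE 3000
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:3000/api/health || exit 1
# Start the application
CMD ["node", "dist/index.js"]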
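🔧 Environment Configuration Management
dockerfile
# File: Dockerfile.dev
FROM node:18-alpine AS development
WORKDIR /app
# Install dependencies, including devDependencies
COPY package*.json ./
RUN npm install
# Copy the source code
COPY . .
# Set development environment variables
ENV NODE_ENV=development
ENV LANGCHAIN_TRACING_V2=true
# Expose the port
EXPOSE 3000
# Start the development server
CMD ["npm", "run", "dev"]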
14.6 Multi-Service Orchestration with Docker Compose
yaml
# File: docker-compose.yml
# Passwords are read from the environment (for example an .env file) instead of being hardcoded;
# the variable names POSTGRES_PASSWORD and GRAFANA_ADMIN_PASSWORD are placeholders.
version: '3.8'
services:
  # AI application service
  ai-app:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - REDIS_URL=redis://redis:6379
      - DATABASE_URL=postgresql://postgres:${POSTGRES_PASSWORD}@postgres:5432/ai_app
    depends_on:
      - redis
      - postgres
    volumes:
      - ./logs:/app/logs
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
  # Redis cache service
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data
    restart: unless-stopped
    command: redis-server --appendonly yes
  # PostgreSQL database
  postgres:
    image: postgres:15-alpine
    environment:
      - POSTGRES_DB=ai_app
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
    restart: unless-stopped
  # Nginx reverse proxy
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - ./ssl:/etc/nginx/ssl
    depends_on:
      - ai-app
    restart: unless-stopped
  # Prometheus monitoring
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    restart: unless-stopped
  # Grafana dashboards
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3001:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD}
    volumes:
      - grafana_data:/var/lib/grafana
    restart: unless-stopped
volumes:
  redis_data:
  postgres_data:
  prometheus_data:
  grafana_data:
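14.7 Image Optimization and Security Hardening
🔒 Security Best Practices
dockerfile
# File: Dockerfile.secure
FROM node:18-alpine AS base
# Apply security updates and install dumb-init
RUN apk update && apk upgrade && apk add --no-cache dumb-init
# Create the application user
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nextjs -u 1001 -G nodejs
# Set the working directory
WORKDIR /app
# Copy the package manifests and install dependencies (devDependencies are needed for the build)
COPY package*.json ./
RUN npm ci
# Copy the application code
COPY --chown=nextjs:nodejs . .
# Build the application, then remove devDependencies and the npm cache
RUN npm run build && npm prune --omit=dev && npm cache clean --force
# Switch to the non-root user
USER nextjs
# Set runtime environment variables
ENV NODE_ENV=production
ENV NODE_OPTIONS="--max-old-space-size=512"
# Use dumb-init as PID 1 so signals are forwarded and zombie processes are reaped
ENTRYPOINT ["dumb-init", "--"]
# Start the application
CMD ["node", "dist/index.js"]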
📊 Reducing Image Size
bash
# Use .dockerignore to keep unnecessary files out of the build context
# File: .dockerignore
node_modules
npm-debug.log
.git
.gitignore
README.md
.env
.nyc_output
coverage
.vscode
.idea
*.log
dist
build
.next
🚀 CI/CD Pipeline: Automated Deployment in Practice (about 25%)
14.8 A Complete GitHub Actions Pipeline
yaml
# File: .github/workflows/deploy.yml
name: AI App CI/CD Pipeline
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}
jobs:
  # Code quality checks
  quality-check:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '18'
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run linting
        run: npm run lint
      - name: Run type checking
        run: npm run type-check
      - name: Run unit tests
        run: npm run test:unit
        env:
          NODE_ENV: test
      - name: Run integration tests
        run: npm run test:integration
        env:
          NODE_ENV: test
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY_TEST }}
      - name: Generate coverage report
        run: npm run test:coverage
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage/lcov.info
  # Security scan
  security-scan:
    runs-on: ubuntu-latest
    # Uploading SARIF results requires write access to security events
    permissions:
      contents: read
      security-events: write
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'fs'
          scan-ref: '.'
          format: 'sarif'
          output: 'trivy-results.sarif'
      - name: Upload Trivy scan results
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: 'trivy-results.sarif'
  # Build and push the image
  build-and-push:
    needs: [quality-check, security-scan]
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    # GITHUB_TOKEN needs packages: write to push to ghcr.io
    permissions:
      contents: read
      packages: write
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Log in to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=ref,event=branch
            type=ref,event=pr
            type=sha,prefix={{branch}}-
            type=raw,value=latest,enable={{is_default_branch}}
      - name: Build and push Docker image
        uses: docker/build-push-action@v5
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
  # Deploy to production
  deploy:
    needs: build-and-push
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    environment: production
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-west-2
      - name: Update ECS service
        run: |
          aws ecs update-service \
            --cluster ai-app-cluster \
            --service ai-app-service \
            --force-new-deployment
      - name: Wait for deployment to complete
        run: |
          aws ecs wait services-stable \
            --cluster ai-app-cluster \
            --services ai-app-service
      - name: Run smoke tests
        run: |
          # Give the service time to start
          sleep 30
          # Run the health check (same path as the container health check)
          curl -f https://api.yourapp.com/api/health || exit 1
          # Install dependencies, then run the basic functional tests
          npm ci
          npm run test:smoke
        env:
          API_BASE_URL: https://api.yourapp.com
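The smoke-test step waits a fixed 30 seconds before probing the service. A common alternative is to poll the health endpoint until it answers or a retry budget runs out. The sketch below shows that idea. `waitUntilHealthy` and its parameters are hypothetical names, and the probe is passed in as a function so that the sketch does not assume any particular HTTP client.

```typescript
// Hypothetical helper: poll a health probe until it succeeds or attempts run out.
// Resolves with the number of attempts that were needed.
async function waitUntilHealthy(
  probe: () => Promise<boolean>,
  maxAttempts: number,
  delayMs: number,
): Promise<number> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // A thrown error (for example, connection refused) counts as "not ready yet".
    const healthy = await probe().catch(() => false);
    if (healthy) return attempt;
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`Service not healthy after ${maxAttempts} attempts`);
}
```

In a smoke-test script, the probe could wrap the global `fetch` available in Node 18+ and return `response.ok` for the health URL, so the step finishes as soon as the service is up and fails clearly when it never becomes healthy.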
14.9 Environment Configuration Management
typescript
// File: src/config/environment.ts
import { z } from 'zod';
// Environment variables are strings, so boolean flags are parsed from 'true'/'false'.
// The default is applied before the transform, which is why it is given as a string.
const booleanFlag = (defaultValue: 'true' | 'false') =>
  z.string().default(defaultValue).transform((val) => val === 'true');
// Validation schema for environment variables
const envSchema = z.object({
  NODE_ENV: z.enum(['development', 'test', 'production']).default('development'),
  PORT: z.coerce.number().int().positive().default(3000),
  // API configuration
  OPENAI_API_KEY: z.string().min(1),
  OPENAI_BASE_URL: z.string().url().optional(),
  // Database configuration
  DATABASE_URL: z.string().url(),
  REDIS_URL: z.string().url(),
  // Monitoring configuration
  LANGCHAIN_TRACING_V2: booleanFlag('false'),
  LANGCHAIN_API_KEY: z.string().optional(),
  // Security configuration
  JWT_SECRET: z.string().min(32),
  CORS_ORIGIN: z.string().default('*'),
  // Feature flags
  ENABLE_RATE_LIMITING: booleanFlag('true'),
  ENABLE_CACHING: booleanFlag('true'),
});
// Validate and export the configuration (throws at startup if a variable is missing or invalid)
export const config = envSchema.parse(process.env);
// Exported type
export type Config = z.infer<typeof envSchema>;
// Environment-specific settings
export const getEnvironmentConfig = () => {
  switch (config.NODE_ENV) {
    case 'development':
      return {
        logLevel: 'debug',
        enableSwagger: true,
        enableMetrics: false,
      };
    case 'test':
      return {
        logLevel: 'error',
        enableSwagger: false,
        enableMetrics: false,
      };
    case 'production':
      return {
        logLevel: 'info',
        enableSwagger: false,
        enableMetrics: true,
      };
    default:
      throw new Error(`Unknown environment: ${config.NODE_ENV}`);
  }
};
14.10 Automated Testing Strategy
typescript
// File: src/tests/integration/ai-service.test.ts
import { describe, it, expect, beforeAll, afterAll } from '@jest/globals';
import { ChatOpenAI } from '@langchain/openai';
import { AIAssistant } from '../../services/ai-assistant';
describe('AI Service Integration Tests', () => {
  let aiAssistant: AIAssistant;
  beforeAll(async () => {
    // Use the test-environment API key (the CI pipeline injects it as OPENAI_API_KEY)
    const model = new ChatOpenAI({
      openAIApiKey: process.env.OPENAI_API_KEY,
      modelName: 'gpt-3.5-turbo',
      temperature: 0,
    });
    aiAssistant = new AIAssistant(model);
  });
  afterAll(async () => {
    // Clean up resources
  });
  describe('Basic Functionality', () => {
    it('should respond to simple questions', async () => {
      const response = await aiAssistant.ask('What is 2+2?');
      expect(response).toBeDefined();
      expect(response.content).toContain('4');
      expect(response.usage).toBeDefined();
    });
    it('should handle empty input gracefully', async () => {
      await expect(aiAssistant.ask('')).rejects.toThrow('Question cannot be empty');
    });
    it('should respect rate limits', async () => {
      const promises = Array(10).fill(null).map(() =>
        aiAssistant.ask('Test question')
      );
      const responses = await Promise.allSettled(promises);
      const rejected = responses.filter(r => r.status === 'rejected');
      // Some of the requests should have been rate limited
      expect(rejected.length).toBeGreaterThan(0);
    });
  });
  describe('Performance Tests', () => {
    it('should respond within acceptable time limits', async () => {
      const startTime = Date.now();
      await aiAssistant.ask('What is the capital of France?');
      const endTime = Date.now();
      const responseTime = endTime - startTime;
      expect(responseTime).toBeLessThan(5000); // Respond within 5 seconds
    });
  });
  describe('Error Handling', () => {
    it('should handle API errors gracefully', async () => {
      // Use an invalid API key to exercise error handling
      const invalidModel = new ChatOpenAI({
        openAIApiKey: 'invalid-key',
        modelName: 'gpt-3.5-turbo',
      });
      const invalidAssistant = new AIAssistant(invalidModel);
      await expect(invalidAssistant.ask('Test')).rejects.toThrow();
    });
  });
});
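The integration tests above call the real model API, which costs tokens and can be flaky. For fast, deterministic unit tests, a stub model is a common complement. The sketch below illustrates the idea. Since the chapter does not show how `AIAssistant` is implemented, the `ChatLike` interface, `StubChatModel`, and `ask` are all hypothetical stand-ins rather than the chapter's actual classes.

```typescript
// Hypothetical minimal model interface (an assumption made for illustration).
interface ChatLike {
  invoke(prompt: string): Promise<{ content: string }>;
}

// Deterministic stand-in for a chat model: returns canned answers and records calls.
class StubChatModel implements ChatLike {
  readonly calls: string[] = [];
  private readonly answers: Map<string, string>;

  constructor(answers: Record<string, string>) {
    this.answers = new Map(Object.entries(answers));
  }

  async invoke(prompt: string): Promise<{ content: string }> {
    this.calls.push(prompt);
    return { content: this.answers.get(prompt) ?? 'I do not know.' };
  }
}

// Hypothetical assistant function with the same empty-input rule the tests above expect.
async function ask(model: ChatLike, question: string): Promise<string> {
  if (question.trim() === '') throw new Error('Question cannot be empty');
  const reply = await model.invoke(question);
  return reply.content;
}
```

With a stub like this, input validation and prompt construction can be tested on every commit, while the slower tests against the real API run less often.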
📊 Monitoring and Observability: A Comprehensive Monitoring System (about 15%)
14.11 Application Performance Monitoring (APM)
typescript
// File: src/monitoring/apm.ts
import { Counter, Histogram, register } from 'prom-client';
import { Request, Response, NextFunction } from 'express';
// Define the Prometheus metrics
const httpRequestDuration = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  buckets: [0.1, 0.3, 0.5, 0.7, 1, 3, 5, 7, 10],
});
const httpRequestTotal = new Counter({
  name: 'http_requests_total',
  help: 'Total number of HTTP requests',
  labelNames: ['method', 'route', 'status_code'],
});
const aiRequestDuration = new Histogram({
  name: 'ai_request_duration_seconds',
  help: 'Duration of AI API requests in seconds',
  labelNames: ['model', 'operation'],
  buckets: [0.5, 1, 2, 5, 10, 30, 60],
});
const aiTokenUsage = new Counter({
  name: 'ai_tokens_used_total',
  help: 'Total number of tokens used',
  labelNames: ['model', 'type'], // type: prompt, completion, total
});
const aiRequestErrors = new Counter({
  name: 'ai_request_errors_total',
  help: 'Total number of AI request errors',
  labelNames: ['model', 'error_type'],
});
// Middleware: HTTP request metrics
export const httpMetricsMiddleware = (req: Request, res: Response, next: NextFunction) => {
  const startTime = Date.now();
  res.on('finish', () => {
    const duration = (Date.now() - startTime) / 1000;
    const route = req.route?.path || req.path;
    httpRequestDuration
      .labels(req.method, route, res.statusCode.toString())
      .observe(duration);
    httpRequestTotal
      .labels(req.method, route, res.statusCode.toString())
      .inc();
  });
  next();
};
// Decorator for monitoring AI requests (legacy decorator signature; requires experimentalDecorators)
export const monitorAIRequest = (model: string, operation: string) => {
  return function (target: any, propertyName: string, descriptor: PropertyDescriptor) {
    const method = descriptor.value;
    descriptor.value = async function (...args: any[]) {
      const startTime = Date.now();
      try {
        const result = await method.apply(this, args);
        const duration = (Date.now() - startTime) / 1000;
        aiRequestDuration
          .labels(model, operation)
          .observe(duration);
        // Record token usage when the result carries it
        if (result?.usage) {
          aiTokenUsage
            .labels(model, 'prompt')
            .inc(result.usage.promptTokens || 0);
          aiTokenUsage
            .labels(model, 'completion')
            .inc(result.usage.completionTokens || 0);
          aiTokenUsage
            .labels(model, 'total')
            .inc(result.usage.totalTokens || 0);
        }
        return result;
      } catch (error: any) {
        aiRequestErrors
          .labels(model, error.name || 'Unknown')
          .inc();
        throw error;
      }
    };
  };
};
// Health check endpoint
export const healthCheck = (req: Request, res: Response) => {
  const health = {
    status: 'healthy',
    timestamp: new Date().toISOString(),
    uptime: process.uptime(),
    memory: process.memoryUsage(),
    version: process.env.npm_package_version || 'unknown',
  };
  res.json(health);
};
// Metrics endpoint
export const metricsEndpoint = async (req: Request, res: Response) => {
  res.set('Content-Type', register.contentType);
  // register.metrics() returns a Promise in current prom-client versions
  res.end(await register.metrics());
};
14.12 Log Management
typescript
// File: src/logging/logger.ts
import winston from 'winston';
import { Request, Response, NextFunction } from 'express';
// Create the Winston logger
const logger = winston.createLogger({
  level: process.env.LOG_LEVEL || 'info',
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.errors({ stack: true }),
    winston.format.json()
  ),
  defaultMeta: { service: 'ai-app' },
  transports: [
    // Console output
    new winston.transports.Console({
      format: winston.format.combine(
        winston.format.colorize(),
        winston.format.simple()
      ),
    }),
    // File output
    new winston.transports.File({
      filename: 'logs/error.log',
      level: 'error',
      maxsize: 5242880, // 5MB
      maxFiles: 5,
    }),
    new winston.transports.File({
      filename: 'logs/combined.log',
      maxsize: 5242880, // 5MB
      maxFiles: 5,
    }),
  ],
});
// Request logging middleware
export const requestLogger = (req: Request, res: Response, next: NextFunction) => {
  const startTime = Date.now();
  // Log the start of the request
  logger.info('Request started', {
    method: req.method,
    url: req.url,
    userAgent: req.get('User-Agent'),
    ip: req.ip,
    requestId: req.headers['x-request-id'],
  });
  res.on('finish', () => {
    const duration = Date.now() - startTime;
    logger.info('Request completed', {
      method: req.method,
      url: req.url,
      statusCode: res.statusCode,
      duration,
      requestId: req.headers['x-request-id'],
    });
  });
  next();
};
// AI request logging (only lengths are logged, not the prompt or response text)
export const logAIRequest = (model: string, prompt: string, response: any, duration: number) => {
  logger.info('AI request completed', {
    model,
    promptLength: prompt.length,
    responseLength: response.content?.length || 0,
    duration,
    usage: response.usage,
    requestId: response.requestId,
  });
};
// Error logging
export const logError = (error: Error, context?: any) => {
  logger.error('Application error', {
    message: error.message,
    stack: error.stack,
    context,
  });
};
export default logger;
14.13 Alerting Configuration
yaml
# File: prometheus/alerts.yml
groups:
  - name: ai-app-alerts
    rules:
      # High error rate
      - alert: HighErrorRate
        expr: rate(http_requests_total{status_code=~"5.."}[5m]) > 0.1
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High error rate detected"
          description: "Error rate is {{ $value }} errors per second"
      # Slow response time
      - alert: HighResponseTime
        expr: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m])) > 2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High response time detected"
          description: "95th percentile response time is {{ $value }} seconds"
      # AI request failures
      - alert: AIRequestFailures
        expr: rate(ai_request_errors_total[5m]) > 0.05
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "AI request failures detected"
          description: "AI request failure rate is {{ $value }} failures per second"
      # Abnormal token usage
      - alert: HighTokenUsage
        expr: rate(ai_tokens_used_total[1h]) > 10000
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High token usage detected"
          description: "Token usage rate is {{ $value }} tokens per second"
      # Memory usage
      - alert: HighMemoryUsage
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage detected"
          description: "Memory usage is {{ $value | humanizePercentage }}"
      # Disk space
      - alert: LowDiskSpace
        expr: (node_filesystem_avail_bytes / node_filesystem_size_bytes) < 0.1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Low disk space"
          description: "Disk space is {{ $value | humanizePercentage }} available"
🚀 Hands-On Project: Full Deployment of an Enterprise-Grade AI Application (about 30%)
14.14 Project Architecture Design
Let's build a complete enterprise AI knowledge Q&A system made up of the following components:
scss
┌─────────────────────────────────────────────────────────────┐
│             Enterprise AI Knowledge Q&A System              │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │Frontend App │  │ API Gateway │  │Load Balancer│          │
│  │  (Next.js)  │  │   (Kong)    │  │   (Nginx)   │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │ AI Service  │  │  Vector DB  │  │    Cache    │          │
│  │ (LangChain) │  │ (Pinecone)  │  │   (Redis)   │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │Doc Processor│  │  User Mgmt  │  │ Audit Logs  │          │
│  │  (Worker)   │  │   (Auth)    │  │  (Logging)  │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
└─────────────────────────────────────────────────────────────┘
14.15 Core Service Implementation
typescript
// File: src/services/ai-knowledge-service.ts
import { ChatOpenAI } from '@langchain/openai';
import { PineconeStore } from '@langchain/pinecone';
import { PromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';
import { RunnableSequence } from '@langchain/core/runnables';
import { monitorAIRequest } from '../monitoring/apm';
import { logAIRequest } from '../logging/logger';
export class AIKnowledgeService {
  private model: ChatOpenAI;
  // Initialization is not shown here; a PineconeStore must be assigned before queryKnowledge is called
  private vectorStore!: PineconeStore;
  private promptTemplate: PromptTemplate;
  constructor() {
    this.model = new ChatOpenAI({
      openAIApiKey: process.env.OPENAI_API_KEY,
      modelName: 'gpt-4',
      temperature: 0.1,
    });
    this.promptTemplate = PromptTemplate.fromTemplate(`
You are a professional enterprise knowledge assistant. Answer the user's question based on the context below.
Context:
{context}
User question: {question}
Give an accurate, detailed answer and cite the sources of the information. If the context is not sufficient to answer the question, say so explicitly.
Answer:
`);
  }
  @monitorAIRequest('gpt-4', 'knowledge-query')
  async queryKnowledge(question: string, userId: string): Promise<{
    answer: string;
    sources: string[];
    confidence: number;
  }> {
    const startTime = Date.now();
    try {
      // 1. Vector retrieval; the WithScore variant returns [document, score] pairs
      const scoredDocs = await this.vectorStore.similaritySearchWithScore(question, 5);
      if (scoredDocs.length === 0) {
        return {
          answer: 'Sorry, I could not find any relevant information to answer your question.',
          sources: [],
          confidence: 0,
        };
      }
      // 2. Build the context
      const context = scoredDocs
        .map(([doc], index) => `[${index + 1}] ${doc.pageContent}`)
        .join('\n\n');
      // 3. Generate the answer
      const chain = RunnableSequence.from([
        this.promptTemplate,
        this.model,
        new StringOutputParser(),
      ]);
      const answer = await chain.invoke({
        context,
        question,
      });
      // 4. Compute the confidence
      const confidence = this.calculateConfidence(scoredDocs.map(([, score]) => score));
      // 5. Extract the sources
      const sources = scoredDocs.map(([doc]) => doc.metadata?.source || 'Unknown');
      const duration = Date.now() - startTime;
      logAIRequest('gpt-4', question, { content: answer }, duration);
      return {
        answer,
        sources,
        confidence,
      };
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      throw new Error(`Knowledge query failed: ${message}`);
    }
  }
  private calculateConfidence(scores: number[]): number {
    // A simple confidence heuristic based on the average similarity score;
    // a real application can use a more sophisticated algorithm
    const avgScore = scores.reduce((sum, score) => sum + score, 0) / scores.length;
    return Math.min(avgScore * 2, 1); // Cap at 1
  }
}
14.16 API Route Implementation
typescript
// File: src/routes/api/knowledge.ts
import { Router } from 'express';
import { AIKnowledgeService } from '../../services/ai-knowledge-service';
import { authenticateToken } from '../../middleware/auth';
import { rateLimiter } from '../../middleware/rate-limiter';
import { validateRequest } from '../../middleware/validation';
import { z } from 'zod';
const router = Router();
const aiService = new AIKnowledgeService();
// Request validation schema
const querySchema = z.object({
  question: z.string().min(1).max(1000),
  context: z.string().optional(),
});
// Knowledge query endpoint
router.post('/query',
  authenticateToken,
  rateLimiter,
  validateRequest(querySchema),
  async (req, res) => {
    try {
      const { question, context } = req.body;
      const userId = req.user.id;
      const result = await aiService.queryKnowledge(question, userId);
      res.json({
        success: true,
        data: result,
        timestamp: new Date().toISOString(),
      });
    } catch (error) {
      res.status(500).json({
        success: false,
        error: error instanceof Error ? error.message : String(error),
        timestamp: new Date().toISOString(),
      });
    }
  }
);
// Batch query endpoint
router.post('/batch-query',
  authenticateToken,
  rateLimiter,
  async (req, res) => {
    try {
      const { questions } = req.body;
      const userId = req.user.id;
      if (!Array.isArray(questions) || questions.length > 10) {
        return res.status(400).json({
          success: false,
          error: 'Invalid questions array (max 10 items)',
        });
      }
      const results = await Promise.all(
        questions.map(question => aiService.queryKnowledge(question, userId))
      );
      res.json({
        success: true,
        data: results,
        timestamp: new Date().toISOString(),
      });
    } catch (error) {
      res.status(500).json({
        success: false,
        error: error instanceof Error ? error.message : String(error),
        timestamp: new Date().toISOString(),
      });
    }
  }
);
export default router;
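The batch endpoint sends every question to the model at once through `Promise.all`. With rate-limited model APIs it can help to cap how many calls are in flight. The following is a minimal sketch of such a cap. `mapWithConcurrency` is a hypothetical helper, not something the chapter's code defines.

```typescript
// Hypothetical helper: run `worker` over `items` with at most `limit` calls in flight.
// Results are returned in the same order as the input items.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be a positive integer');
  }
  const results: R[] = new Array(items.length);
  let nextIndex = 0;

  // Each runner keeps claiming the next unprocessed index until none remain.
  const runner = async (): Promise<void> => {
    while (nextIndex < items.length) {
      const index = nextIndex++;
      results[index] = await worker(items[index], index);
    }
  };

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

In the batch route, the `Promise.all(questions.map(...))` call could be swapped for something like `mapWithConcurrency(questions, 3, (q) => aiService.queryKnowledge(q, userId))`, where the limit of 3 is an arbitrary example value.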
14.17 Kubernetes Deployment Configuration
yaml
# File: k8s/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: ai-app
  labels:
    name: ai-app
---
# File: k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ai-app-config
  namespace: ai-app
data:
  NODE_ENV: "production"
  LOG_LEVEL: "info"
  ENABLE_METRICS: "true"
  ENABLE_CACHING: "true"
---
# File: k8s/secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: ai-app-secrets
  namespace: ai-app
type: Opaque
data:
  OPENAI_API_KEY: <base64-encoded-key>
  DATABASE_URL: <base64-encoded-url>
  JWT_SECRET: <base64-encoded-secret>
---
# File: k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-app
  namespace: ai-app
  labels:
    app: ai-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ai-app
  template:
    metadata:
      labels:
        app: ai-app
    spec:
      containers:
        - name: ai-app
          image: ghcr.io/your-org/ai-app:latest
          ports:
            - containerPort: 3000
          env:
            - name: NODE_ENV
              valueFrom:
                configMapKeyRef:
                  name: ai-app-config
                  key: NODE_ENV
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: ai-app-secrets
                  key: OPENAI_API_KEY
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
---
# File: k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: ai-app-service
  namespace: ai-app
spec:
  selector:
    app: ai-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
  type: ClusterIP
---
# File: k8s/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ai-app-ingress
  namespace: ai-app
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  tls:
    - hosts:
        - api.yourapp.com
      secretName: ai-app-tls
  rules:
    - host: api.yourapp.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ai-app-service
                port:
                  number: 80
---
# File: k8s/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-app-hpa
  namespace: ai-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
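To build intuition for how the HorizontalPodAutoscaler above reacts to load, the sketch below applies the core scaling rule from the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), taking the largest proposal when several metrics are configured. It is a simplified model. The real controller also applies a tolerance band and stabilization windows, and the names `MetricReading` and `desiredReplicas` are hypothetical.

```typescript
// One metric as the autoscaler sees it: current average utilization vs. target (both in percent).
interface MetricReading {
  current: number;
  target: number;
}

// Simplified model of the HPA scaling rule (no tolerance or stabilization window).
function desiredReplicas(
  currentReplicas: number,
  metrics: MetricReading[],
  minReplicas: number,
  maxReplicas: number,
): number {
  if (metrics.length === 0) throw new Error('at least one metric is required');
  // Each metric proposes a replica count; the largest proposal wins.
  const proposals = metrics.map((m) => Math.ceil(currentReplicas * (m.current / m.target)));
  const desired = Math.max(...proposals);
  // Clamp to the configured bounds.
  return Math.min(maxReplicas, Math.max(minReplicas, desired));
}
```

For example, with 3 replicas, CPU at 90% against the 70% target, and memory at 40% against the 80% target, the CPU metric proposes 4 replicas and memory proposes 2, so this model scales to 4.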
14.18 Service Mesh Configuration (Istio)
yaml
# File: k8s/istio-gateway.yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: ai-app-gateway
  namespace: ai-app
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 443
        name: https
        protocol: HTTPS
      tls:
        mode: SIMPLE
        credentialName: ai-app-tls
      hosts:
        - api.yourapp.com
---
# File: k8s/istio-virtualservice.yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: ai-app-vs
  namespace: ai-app
spec:
  hosts:
    - api.yourapp.com
  gateways:
    - ai-app-gateway
  http:
    - match:
        - uri:
            prefix: /api/
      route:
        - destination:
            host: ai-app-service
            port:
              number: 80
      timeout: 30s
      retries:
        attempts: 3
        perTryTimeout: 10s
---
# File: k8s/istio-destinationrule.yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: ai-app-dr
  namespace: ai-app
spec:
  host: ai-app-service
  trafficPolicy:
    loadBalancer:
      simple: LEAST_REQUEST
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 50
        maxRequestsPerConnection: 10
    # Circuit breaking in Istio is configured through outlier detection
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s
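The circuit-breaking settings above are enforced by the mesh's sidecar proxies, but the underlying idea is easy to see in application code: after a run of consecutive failures, stop calling the backend for a cooldown period. The sketch below is an application-level illustration of that idea only. It is not how Istio or Envoy implement it, and the `CircuitBreaker` class is hypothetical.

```typescript
type BreakerState = 'closed' | 'open';

// Hypothetical minimal circuit breaker: opens after N consecutive failures,
// then lets calls through again once the cooldown has elapsed.
class CircuitBreaker {
  private consecutiveFailures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, cooldownMs: number, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  get state(): BreakerState {
    if (this.openedAt === null) return 'closed';
    return this.now() - this.openedAt >= this.cooldownMs ? 'closed' : 'open';
  }

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'open') {
      throw new Error('Circuit open: request rejected without calling the backend');
    }
    try {
      const result = await fn();
      // Any success resets the failure streak.
      this.consecutiveFailures = 0;
      this.openedAt = null;
      return result;
    } catch (err) {
      this.consecutiveFailures++;
      if (this.consecutiveFailures >= this.failureThreshold) {
        this.openedAt = this.now(); // Trip (or re-trip) the breaker.
      }
      throw err;
    }
  }
}
```

The threshold and cooldown play roughly the same roles as the consecutive-error count and `baseEjectionTime` in the DestinationRule, although the mesh ejects individual unhealthy hosts from the load-balancing pool rather than blocking the whole service.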
🔒 Security Protection: Enterprise-Grade Security Practices (about 10%)
14.19 API Security
typescript
// File: src/security/api-security.ts
import { Request, Response, NextFunction } from 'express';
import rateLimit from 'express-rate-limit';
import helmet from 'helmet';
import { ValidationChain, validationResult } from 'express-validator';
// Rate limiting configuration
export const createRateLimiter = (windowMs: number, max: number) => {
  return rateLimit({
    windowMs,
    max,
    message: {
      error: 'Too many requests',
      retryAfter: Math.ceil(windowMs / 1000),
    },
    standardHeaders: true,
    legacyHeaders: false,
    handler: (req: Request, res: Response) => {
      res.status(429).json({
        error: 'Rate limit exceeded',
        retryAfter: Math.ceil(windowMs / 1000),
      });
    },
  });
};
// Security headers configuration
export const securityHeaders = helmet({
  contentSecurityPolicy: {
    directives: {
      defaultSrc: ["'self'"],
      styleSrc: ["'self'", "'unsafe-inline'"],
      scriptSrc: ["'self'"],
      imgSrc: ["'self'", "data:", "https:"],
    },
  },
  hsts: {
    maxAge: 31536000,
    includeSubDomains: true,
    preload: true,
  },
});
// Input validation middleware
export const validateInput = (validations: ValidationChain[]) => {
  return async (req: Request, res: Response, next: NextFunction) => {
    await Promise.all(validations.map(validation => validation.run(req)));
    const errors = validationResult(req);
    if (!errors.isEmpty()) {
      return res.status(400).json({
        error: 'Validation failed',
        details: errors.array(),
      });
    }
    next();
  };
};
// API key validation
export const validateApiKey = (req: Request, res: Response, next: NextFunction) => {
  const apiKey = req.headers['x-api-key'] as string;
  if (!apiKey) {
    return res.status(401).json({ error: 'API key required' });
  }
  // Check the format and validity of the API key
  if (!isValidApiKey(apiKey)) {
    return res.status(403).json({ error: 'Invalid API key' });
  }
  next();
};
// Sensitive data filtering
export const sanitizeResponse = (req: Request, res: Response, next: NextFunction) => {
  const originalSend = res.send;
  res.send = function (data) {
    if (typeof data === 'string') {
      // Redact password and token values in JSON output while keeping the JSON valid
      data = data.replace(/"(password|token)"\s*:\s*"[^"]*"/gi, '"$1":"[REDACTED]"');
    }
    return originalSend.call(this, data);
  };
  next();
};
function isValidApiKey(apiKey: string): boolean {
  // Format check only; a real implementation must also look the key up in a key store
  return /^[a-zA-Z0-9]{32,}$/.test(apiKey);
}
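A format check such as `isValidApiKey` still has to be followed by a comparison against the keys that were actually issued. When doing that comparison, a constant-time check avoids leaking information through response timing. The sketch below uses Node's built-in `crypto.timingSafeEqual`. The helper names and the in-memory key list are assumptions for illustration, and a real service would load keys (ideally stored as hashes) from a secret store or database.

```typescript
import { createHash, timingSafeEqual } from 'node:crypto';

// Hash to a fixed-length digest so timingSafeEqual always compares equal-length buffers.
function sha256(value: string): Buffer {
  return createHash('sha256').update(value, 'utf8').digest();
}

// Constant-time comparison of a presented key against one expected key.
function apiKeyMatches(presented: string, expected: string): boolean {
  return timingSafeEqual(sha256(presented), sha256(expected));
}

// Hypothetical lookup against a small in-memory list of valid keys.
function isKnownApiKey(presented: string, validKeys: readonly string[]): boolean {
  // Check every key rather than returning early, so timing does not reveal which entry matched.
  let matched = false;
  for (const key of validKeys) {
    if (apiKeyMatches(presented, key)) matched = true;
  }
  return matched;
}
```

Hashing before comparing is needed because `timingSafeEqual` throws when its two inputs differ in length.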
14.20 Data Encryption and Privacy Protection
typescript
// File: src/security/encryption.ts
import crypto from 'crypto';
import bcrypt from 'bcrypt';
export class EncryptionService {
  private readonly algorithm = 'aes-256-gcm';
  private readonly keyLength = 32;
  private readonly ivLength = 12; // 96-bit IV, the recommended size for GCM
  private readonly tagLength = 16;
  private readonly key: Buffer;
  constructor(secretKey: string) {
    if (!secretKey || secretKey.length < 32) {
      throw new Error('Secret key must be at least 32 characters long');
    }
    // Derive a fixed-length 256-bit key from the secret.
    // The salt here is a fixed application-level value; use a per-deployment salt where possible.
    this.key = crypto.scryptSync(secretKey, 'ai-app-encryption', this.keyLength);
  }
  // Encrypt data; the output format is iv:tag:ciphertext, all hex encoded
  encrypt(text: string): string {
    const iv = crypto.randomBytes(this.ivLength);
    const cipher = crypto.createCipheriv(this.algorithm, this.key, iv, {
      authTagLength: this.tagLength,
    });
    cipher.setAAD(Buffer.from('additional-data'));
    let encrypted = cipher.update(text, 'utf8', 'hex');
    encrypted += cipher.final('hex');
    const tag = cipher.getAuthTag();
    return iv.toString('hex') + ':' + tag.toString('hex') + ':' + encrypted;
  }
  // Decrypt data; throws if the ciphertext or tag has been tampered with
  decrypt(encryptedData: string): string {
    const parts = encryptedData.split(':');
    if (parts.length !== 3) {
      throw new Error('Invalid encrypted data format');
    }
    const iv = Buffer.from(parts[0], 'hex');
    const tag = Buffer.from(parts[1], 'hex');
    const encrypted = parts[2];
    const decipher = crypto.createDecipheriv(this.algorithm, this.key, iv, {
      authTagLength: this.tagLength,
    });
    decipher.setAAD(Buffer.from('additional-data'));
    decipher.setAuthTag(tag);
    let decrypted = decipher.update(encrypted, 'hex', 'utf8');
    decrypted += decipher.final('utf8');
    return decrypted;
  }
  // Hash a password
  async hashPassword(password: string): Promise<string> {
    const saltRounds = 12;
    return await bcrypt.hash(password, saltRounds);
  }
  // Verify a password
  async verifyPassword(password: string, hashedPassword: string): Promise<boolean> {
    return await bcrypt.compare(password, hashedPassword);
  }
  // Generate a secure random string (hex encoded, so twice `length` characters)
  generateSecureToken(length: number = 32): string {
    return crypto.randomBytes(length).toString('hex');
  }
  // Mask sensitive fields, keeping only the first and last two characters
  maskSensitiveData(data: any, fields: string[]): any {
    const masked = { ...data };
    fields.forEach(field => {
      if (masked[field]) {
        const value = String(masked[field]);
        if (value.length > 4) {
          masked[field] = value.substring(0, 2) + '*'.repeat(value.length - 4) + value.substring(value.length - 2);
        } else {
          masked[field] = '*'.repeat(value.length);
        }
      }
    });
    return masked;
  }
}
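To make the authenticated-encryption behavior of AES-256-GCM concrete, here is a small standalone round trip that uses only Node's built-in `crypto` module. It is a sketch under stated assumptions: the function names are hypothetical, the key is a random 32-byte buffer rather than one derived from a secret, and the `iv:tag:ciphertext` hex layout mirrors the format used in this section.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Encrypt with AES-256-GCM; returns "iv:tag:ciphertext", all hex encoded.
function encryptText(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // A fresh 96-bit IV for every message.
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv, tag, ciphertext].map((part) => part.toString('hex')).join(':');
}

// Decrypt and verify; throws if the key is wrong or the payload was modified.
function decryptText(payload: string, key: Buffer): string {
  const parts = payload.split(':');
  if (parts.length !== 3) throw new Error('Invalid encrypted data format');
  const [iv, tag, ciphertext] = parts.map((part) => Buffer.from(part, 'hex'));
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
}
```

Because GCM authenticates the ciphertext, changing even one character of the payload makes `decryptText` throw instead of returning corrupted plaintext, which is the main reason to prefer it over unauthenticated modes.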
14.21 访问控制与权限管理
typescript
// 文件:src/security/rbac.ts
import { Request, Response, NextFunction } from 'express';
export enum Permission {
READ_KNOWLEDGE = 'read:knowledge',
WRITE_KNOWLEDGE = 'write:knowledge',
DELETE_KNOWLEDGE = 'delete:knowledge',
MANAGE_USERS = 'manage:users',
VIEW_ANALYTICS = 'view:analytics',
MANAGE_SYSTEM = 'manage:system',
}
export enum Role {
ADMIN = 'admin',
MANAGER = 'manager',
USER = 'user',
GUEST = 'guest',
}
export interface User {
id: string;
email: string;
role: Role;
permissions: Permission[];
}
// 角色权限映射
const rolePermissions: Record<Role, Permission[]> = {
[Role.ADMIN]: Object.values(Permission),
[Role.MANAGER]: [
Permission.READ_KNOWLEDGE,
Permission.WRITE_KNOWLEDGE,
Permission.VIEW_ANALYTICS,
],
[Role.USER]: [
Permission.READ_KNOWLEDGE,
],
[Role.GUEST]: [],
};
// Permission-check middleware
export const requirePermission = (permission: Permission) => {
return (req: Request, res: Response, next: NextFunction) => {
const user = req.user as User;
if (!user) {
return res.status(401).json({ error: 'Authentication required' });
}
if (!hasPermission(user, permission)) {
return res.status(403).json({
error: 'Insufficient permissions',
required: permission,
userRole: user.role,
});
}
next();
};
};
// Role-check middleware
export const requireRole = (roles: Role[]) => {
return (req: Request, res: Response, next: NextFunction) => {
const user = req.user as User;
if (!user) {
return res.status(401).json({ error: 'Authentication required' });
}
if (!roles.includes(user.role)) {
return res.status(403).json({
error: 'Insufficient role',
required: roles,
userRole: user.role,
});
}
next();
};
};
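// Resource-ownership check
// Note: an ID read from the request is client-controlled; for stored resources,
// compare against the owner recorded with the resource instead
export const requireOwnership = (resourceUserIdField: string = 'userId') => {
return (req: Request, res: Response, next: NextFunction) => {
const user = req.user as User;
// req.body is undefined when no body parser ran, hence the optional chaining
const resourceUserId = req.params[resourceUserIdField] || req.body?.[resourceUserIdField];
if (!user) {
return res.status(401).json({ error: 'Authentication required' });
}
// Admins can access every resource
if (user.role === Role.ADMIN) {
return next();
}
// Check resource ownership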
if (user.id !== resourceUserId) {
return res.status(403).json({
error: 'Access denied: resource ownership required',
});
}
next();
};
};
function hasPermission(user: User, permission: Permission): boolean {
return user.permissions.includes(permission) ||
rolePermissions[user.role]?.includes(permission) || false;
}
// Programmatic permission check
export const checkPermission = (user: User, permission: Permission): boolean => {
return hasPermission(user, permission);
};
// Permission decorator (method decorator; assumes the first argument is the Express request)
export const RequirePermission = (permission: Permission) => {
return function (target: any, propertyName: string, descriptor: PropertyDescriptor) {
const method = descriptor.value;
descriptor.value = function (...args: any[]) {
const req = args[0] as Request;
const user = req.user as User | undefined;
// Reject unauthenticated calls too; otherwise hasPermission would throw a TypeError
if (!user || !hasPermission(user, permission)) {
throw new Error(`Permission denied: ${permission}`);
}
return method.apply(this, args);
};
};
};
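To see how the role mapping and direct grants combine, here is a small self-contained sketch. String-literal unions stand in for the `Permission` and `Role` enums so it runs on its own, and `authorize` is a hypothetical helper that returns the HTTP status the middleware above would answer with.

```typescript
// Trimmed-down stand-ins for the Permission and Role enums
type Permission = 'read:knowledge' | 'write:knowledge' | 'manage:users';
type Role = 'admin' | 'manager' | 'user' | 'guest';

interface User {
  id: string;
  role: Role;
  permissions: Permission[]; // grants given directly to this user
}

const rolePermissions: Record<Role, Permission[]> = {
  admin: ['read:knowledge', 'write:knowledge', 'manage:users'],
  manager: ['read:knowledge', 'write:knowledge'],
  user: ['read:knowledge'],
  guest: [],
};

// A user holds a permission if it is granted directly or inherited from the role.
function hasPermission(user: User, permission: Permission): boolean {
  return user.permissions.includes(permission) || rolePermissions[user.role].includes(permission);
}

// Hypothetical helper mirroring the middleware's decisions.
function authorize(user: User | undefined, permission: Permission): 200 | 401 | 403 {
  if (!user) return 401; // not authenticated
  return hasPermission(user, permission) ? 200 : 403; // authenticated but not allowed
}

const guest: User = { id: 'u1', role: 'guest', permissions: [] };
const editor: User = { id: 'u2', role: 'user', permissions: ['write:knowledge'] };

console.log(authorize(undefined, 'read:knowledge')); // 401
console.log(authorize(guest, 'read:knowledge'));     // 403
console.log(authorize(editor, 'write:knowledge'));   // 200 (direct grant)
console.log(authorize(editor, 'manage:users'));      // 403
```

Keeping 401 and 403 apart matters for clients: 401 tells them to log in, 403 tells them that logging in again will not help.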
🚀 Hands-On Project: A Complete Deployment Walkthrough (about 20%)
14.22 Automating Deployment with a Script
bash
#!/bin/bash
# File: scripts/deploy.sh
set -euo pipefail
# Configuration variables (exported so envsubst can substitute them in the manifest)
export APP_NAME="ai-knowledge-app"
export NAMESPACE="ai-app"
export REGISTRY="ghcr.io/your-org"
export VERSION="${1:-latest}"
export ENVIRONMENT="${2:-production}"
echo "🚀 Deploying $APP_NAME version $VERSION to the $ENVIRONMENT environment"
# 1. Environment check
echo "📋 Checking the deployment environment..."
kubectl version --client
docker version
helm version
# 2. Build and push the image
echo "🔨 Building the Docker image..."
docker build -t "$REGISTRY/$APP_NAME:$VERSION" .
docker push "$REGISTRY/$APP_NAME:$VERSION"
# 3. Update the Kubernetes configuration
echo "⚙️ Updating the Kubernetes configuration..."
envsubst < k8s/deployment.yaml | kubectl apply -f -
kubectl set image "deployment/$APP_NAME" "$APP_NAME=$REGISTRY/$APP_NAME:$VERSION" -n "$NAMESPACE"
# 4. Wait for the rollout to finish
echo "⏳ Waiting for the rollout to finish..."
kubectl rollout status "deployment/$APP_NAME" -n "$NAMESPACE" --timeout=300s
# 5. Run health checks
echo "🏥 Running health checks..."
kubectl get pods -n "$NAMESPACE" -l "app=$APP_NAME"
# Give the service time to become ready
sleep 30
# Check the service health endpoint
HEALTH_URL="https://api.yourapp.com/api/health"
if curl -f "$HEALTH_URL"; then
echo "✅ Health check passed"
else
echo "❌ Health check failed"
exit 1
fi
# 6. Run smoke tests
echo "🧪 Running smoke tests..."
npm run test:smoke
# 7. Update the monitoring configuration
echo "📊 Updating the monitoring configuration..."
kubectl apply -f monitoring/
# 8. Clean up completed pods left over from earlier runs
echo "🧹 Cleaning up old pods..."
kubectl delete pods -n "$NAMESPACE" -l "app=$APP_NAME" --field-selector=status.phase=Succeeded
echo "🎉 Deployment complete!"
echo "📱 Application URL: https://api.yourapp.com"
echo "📊 Monitoring dashboard: https://grafana.yourapp.com"
echo "📋 View logs: kubectl logs -f deployment/$APP_NAME -n $NAMESPACE"
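The script waits a fixed 30 seconds before a single health check. A common alternative is to poll the endpoint with a growing delay, so a fast start is detected early and a slow start still gets a fair chance. The sketch below shows one way to do that; `waitForHealthy`, its option names, and the default delays are assumptions made for illustration.

```typescript
// A probe resolves to true when the service reports healthy.
type Probe = () => Promise<boolean>;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls `probe` until it succeeds or `maxAttempts` is exhausted.
// The delay doubles after each failure, capped at `maxDelayMs`.
// Resolves with the number of attempts that were needed.
async function waitForHealthy(
  probe: Probe,
  { maxAttempts = 10, initialDelayMs = 1000, maxDelayMs = 15000 } = {},
): Promise<number> {
  let delay = initialDelayMs;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // A probe that throws (connection refused, DNS failure) counts as "not ready yet"
    const healthy = await probe().catch(() => false);
    if (healthy) return attempt;
    if (attempt < maxAttempts) {
      await sleep(delay);
      delay = Math.min(delay * 2, maxDelayMs);
    }
  }
  throw new Error(`Service not healthy after ${maxAttempts} attempts`);
}

// Demo with a fake probe that becomes healthy on its third call
let probeCalls = 0;
const fakeProbe: Probe = async () => ++probeCalls >= 3;
waitForHealthy(fakeProbe, { initialDelayMs: 10 }).then((attempts) => {
  console.log(`healthy after ${attempts} attempts`); // healthy after 3 attempts
});
```

A real probe would wrap an HTTP request to the health URL and return whether the response status was successful.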
14.23 Rollback Script
bash
#!/bin/bash
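# File: scripts/rollback.sh
set -euo pipefail
APP_NAME="ai-knowledge-app"
NAMESPACE="ai-app"
echo "🔄 Rolling back $APP_NAME"
# 1. Show the rollout history
echo "📋 Showing the rollout history..."
kubectl rollout history "deployment/$APP_NAME" -n "$NAMESPACE"
# 2. Roll back to the previous revision
echo "⬅️ Rolling back..."
kubectl rollout undo "deployment/$APP_NAME" -n "$NAMESPACE"
# 3. Wait for the rollback to finish
echo "⏳ Waiting for the rollback to finish..."
kubectl rollout status "deployment/$APP_NAME" -n "$NAMESPACE" --timeout=300s
# 4. Verify the result
echo "✅ Verifying the rollback..."
kubectl get pods -n "$NAMESPACE" -l "app=$APP_NAME"
# 5. Health check
echo "🏥 Running the health check..."
sleep 30
HEALTH_URL="https://api.yourapp.com/api/health"
if curl -f "$HEALTH_URL"; then
echo "✅ Rollback succeeded, the service is healthy"
else
echo "❌ The service is unhealthy after the rollback, manual intervention required"
exit 1
fi
echo "🎉 Rollback complete!"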
14.24 Monitoring Dashboard Configuration
json
{
"dashboard": {
"id": null,
"title": "AI Knowledge App Dashboard",
"tags": ["ai", "knowledge", "production"],
"timezone": "browser",
"panels": [
{
"title": "Request Rate",
"type": "graph",
"targets": [
{
"expr": "rate(http_requests_total[5m])",
"legendFormat": "{{method}} {{route}}"
}
],
"yAxes": [
{
"label": "requests/sec"
}
]
},
{
"title": "Response Time",
"type": "graph",
"targets": [
{
"expr": "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))",
"legendFormat": "95th percentile"
},
{
"expr": "histogram_quantile(0.50, rate(http_request_duration_seconds_bucket[5m]))",
"legendFormat": "50th percentile"
}
],
"yAxes": [
{
"label": "seconds"
}
]
},
{
"title": "Error Rate",
"type": "graph",
"targets": [
{
"expr": "rate(http_requests_total{status_code=~\"5..\"}[5m])",
"legendFormat": "5xx errors"
}
],
"yAxes": [
{
"label": "errors/sec"
}
]
},
{
"title": "AI Token Usage",
"type": "graph",
"targets": [
{
"expr": "increase(ai_tokens_used_total[1h])",
"legendFormat": "{{model}} {{type}}"
}
],
"yAxes": [
{
"label": "tokens/hour"
}
]
},
{
"title": "Memory Usage",
"type": "graph",
"targets": [
{
"expr": "container_memory_usage_bytes{container=\"ai-app\"}",
"legendFormat": "Memory Usage"
}
],
"yAxes": [
{
"label": "bytes"
}
]
},
{
"title": "CPU Usage",
"type": "graph",
"targets": [
{
"expr": "rate(container_cpu_usage_seconds_total{container=\"ai-app\"}[5m])",
"legendFormat": "CPU Usage"
}
],
"yAxes": [
{
"label": "cores"
}
]
}
],
"time": {
"from": "now-1h",
"to": "now"
},
"refresh": "30s"
}
}
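The Response Time panel relies on `histogram_quantile`, which estimates a percentile from cumulative bucket counts rather than from raw timings. The sketch below walks through the interpolation idea on made-up numbers. It is a simplified illustration, not Prometheus's exact implementation, and the bucket bounds and counts are invented for the example.

```typescript
// One cumulative bucket: `count` observations took at most `le` seconds.
interface Bucket {
  le: number;
  count: number;
}

// Estimates the q-quantile (0 <= q <= 1) from cumulative buckets sorted by `le`,
// where the last bucket has le = Infinity. Observations are assumed to be spread
// evenly inside each bucket, so the result is a linear interpolation.
function histogramQuantile(q: number, buckets: Bucket[]): number {
  if (buckets.length < 2) return NaN;
  const total = buckets[buckets.length - 1].count;
  if (total === 0) return NaN;
  const rank = q * total;
  // First bucket whose cumulative count reaches the rank
  const i = buckets.findIndex((b) => b.count >= rank);
  // Rank falls into the +Inf bucket: report the highest finite upper bound
  if (i === buckets.length - 1) return buckets[buckets.length - 2].le;
  const lower = i === 0 ? 0 : buckets[i - 1].le;
  const below = i === 0 ? 0 : buckets[i - 1].count;
  const inBucket = buckets[i].count - below;
  return lower + (buckets[i].le - lower) * ((rank - below) / inBucket);
}

// 100 requests: 50 finished within 0.25 s, 90 within 0.5 s, 98 within 1 s
const buckets: Bucket[] = [
  { le: 0.25, count: 50 },
  { le: 0.5, count: 90 },
  { le: 1, count: 98 },
  { le: Infinity, count: 100 },
];
console.log(histogramQuantile(0.5, buckets));  // 0.25
console.log(histogramQuantile(0.95, buckets)); // 0.8125
```

The takeaway for dashboard design is that the estimate can only be as precise as the bucket layout: with these buckets, any p95 between 0.5 s and 1 s is interpolated from just two boundaries.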
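⚙️ Best Practices and Common Problems (about 10%)
14.25 Performance Optimization Best Practices
🚀 Application-layer optimization
- Use a connection pool to manage database connections
- Implement a smart caching strategy (Redis plus an in-memory cache)
- Enable Gzip compression to reduce transfer size
- Serve static assets through a CDN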
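The "Redis plus in-memory cache" idea can be sketched as a two-tier lookup: check process memory first, then the shared store, and only then do the expensive work. The code below is a minimal illustration under stated assumptions: `RemoteStore` is a hypothetical interface that a Redis client wrapper would implement, `FakeRemoteStore` is an in-memory stand-in used only for the demo, and the TTL defaults are arbitrary.

```typescript
// Minimal async key-value interface; a Redis client wrapper would implement it.
interface RemoteStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

class TwoTierCache {
  private local = new Map<string, { value: string; expiresAt: number }>();
  private remote: RemoteStore;
  private localTtlMs: number;
  private remoteTtlSeconds: number;

  constructor(remote: RemoteStore, localTtlMs = 5000, remoteTtlSeconds = 300) {
    this.remote = remote;
    this.localTtlMs = localTtlMs;
    this.remoteTtlSeconds = remoteTtlSeconds;
  }

  // Looks in process memory first, then the remote store, then calls `load`.
  async getOrLoad(key: string, load: () => Promise<string>): Promise<string> {
    const hit = this.local.get(key);
    if (hit && hit.expiresAt > Date.now()) return hit.value;

    let value = await this.remote.get(key);
    if (value === null) {
      value = await load(); // for example, the expensive model or database call
      await this.remote.set(key, value, this.remoteTtlSeconds);
    }
    this.local.set(key, { value, expiresAt: Date.now() + this.localTtlMs });
    return value;
  }
}

// In-memory stand-in for Redis, used only to exercise the sketch (the TTL is ignored).
class FakeRemoteStore implements RemoteStore {
  data = new Map<string, string>();
  async get(key: string): Promise<string | null> {
    return this.data.get(key) ?? null;
  }
  async set(key: string, value: string): Promise<void> {
    this.data.set(key, value);
  }
}

const demoCache = new TwoTierCache(new FakeRemoteStore());
let demoLoads = 0;
const demoLoad = async () => { demoLoads += 1; return 'cached answer'; };
demoCache.getOrLoad('question', demoLoad)
  .then(() => demoCache.getOrLoad('question', demoLoad))
  .then((value) => console.log(value, 'loads:', demoLoads)); // cached answer loads: 1
```

The short local TTL bounds how long one instance can serve a stale value after another instance has updated the shared store.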
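📊 Monitoring optimization
- Set sensible alert thresholds to avoid alert fatigue
- Monitor in layers: infrastructure → application → business
- Regularly review and remove unused metrics
- Define a retention policy for monitoring data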
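One practical way to keep thresholds from causing alert fatigue is to require a breach to persist before the alert fires, similar in spirit to the `for:` clause of a Prometheus alerting rule. The sketch below illustrates the idea; `createAlertEvaluator`, the 5% threshold, and the three-sample window are assumptions chosen for the example.

```typescript
type AlertState = 'ok' | 'pending' | 'firing';

// Returns an evaluator that fires only after `requiredBreaches` consecutive
// samples exceed `threshold`. A single healthy sample resets the streak.
function createAlertEvaluator(threshold: number, requiredBreaches: number) {
  let consecutive = 0;
  return (sample: number): AlertState => {
    if (sample <= threshold) {
      consecutive = 0;
      return 'ok';
    }
    consecutive += 1;
    return consecutive >= requiredBreaches ? 'firing' : 'pending';
  };
}

// Error rate sampled once per evaluation interval
const evaluateErrorRate = createAlertEvaluator(0.05, 3);
const errorRates = [0.01, 0.2, 0.02, 0.08, 0.09, 0.1];
const states = errorRates.map((rate) => evaluateErrorRate(rate));
console.log(states.join(', ')); // ok, pending, ok, pending, pending, firing
```

The lone spike to 20% never pages anyone, while the sustained rise above 5% fires on its third consecutive sample.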
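🔒 Security optimization
- Update dependencies regularly to patch security vulnerabilities
- Configure access control according to the principle of least privilege
- Adopt a zero-trust network architecture
- Run security audits and penetration tests regularly
14.26 Common Problems and Solutions
❓ Problem 1: The service fails to start after deployment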
bash
# Diagnostic steps
kubectl describe pod <pod-name> -n ai-app
kubectl logs <pod-name> -n ai-app
kubectl get events -n ai-app --sort-by='.lastTimestamp'
# Common causes and how to check for them
# 1. Misconfigured environment variables
kubectl get configmap ai-app-config -n ai-app -o yaml
kubectl get secret ai-app-secrets -n ai-app -o yaml
# 2. Resource limits set too low
kubectl top pods -n ai-app
kubectl describe pod <pod-name> -n ai-app | grep -A 5 "Limits\|Requests"
# 3. Misconfigured health checks
kubectl get pod <pod-name> -n ai-app -o yaml | grep -A 10 "livenessProbe\|readinessProbe"
❓ Problem 2: AI requests respond slowly
typescript
// Diagnosis and optimization outline. The get*Metrics, warmupModels, optimizeCache,
// adjustConcurrency, and enableBatching helpers are application-specific and not shown here.
export class PerformanceDiagnostics {
static async diagnoseSlowRequests() {
// 1. Check API response times
const apiMetrics = await this.getAPIMetrics();
console.log('API Response Times:', apiMetrics);
// 2. Check model loading times
const modelMetrics = await this.getModelMetrics();
console.log('Model Loading Times:', modelMetrics);
// 3. Check the cache hit rate
const cacheMetrics = await this.getCacheMetrics();
console.log('Cache Hit Rate:', cacheMetrics);
// 4. Check database query performance
const dbMetrics = await this.getDatabaseMetrics();
console.log('Database Performance:', dbMetrics);
}
static async optimizePerformance() {
// 1. Warm up the models
await this.warmupModels();
// 2. Tune the caching strategy
await this.optimizeCache();
// 3. Adjust concurrency limits
await this.adjustConcurrency();
// 4. Enable request batching
await this.enableBatching();
}
}
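Of the optimization steps listed above, the concurrency limit is the easiest to show in isolation. Here is a minimal sketch of a promise-based limiter that lets at most `limit` model calls run at once and queues the rest. `createLimiter` is an illustrative name, and the simulated task merely stands in for a real model request.

```typescript
// Runs at most `limit` tasks at once; extra tasks wait in a FIFO queue.
function createLimiter(limit: number) {
  let active = 0;
  const waiting: Array<() => void> = [];

  return async function run<T>(task: () => Promise<T>): Promise<T> {
    if (active < limit) {
      active += 1; // free slot: take it immediately
    } else {
      // No free slot: wait until a finishing task hands its slot over
      await new Promise<void>((resolve) => waiting.push(resolve));
    }
    try {
      return await task();
    } finally {
      const next = waiting.shift();
      if (next) next(); // pass the slot directly to the next waiter
      else active -= 1; // nobody is waiting: free the slot
    }
  };
}

// Demo: five simulated model calls, never more than two in flight
const limited = createLimiter(2);
let inFlight = 0;
let peakInFlight = 0;
const simulatedCall = async (n: number): Promise<number> => {
  inFlight += 1;
  peakInFlight = Math.max(peakInFlight, inFlight);
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  inFlight -= 1;
  return n * 2;
};

Promise.all([1, 2, 3, 4, 5].map((n) => limited(() => simulatedCall(n)))).then((results) => {
  console.log(results.join(','), 'peak:', peakInFlight); // 2,4,6,8,10 peak: 2
});
```

Handing the slot straight to the next waiter, instead of decrementing and re-incrementing the counter, keeps a newly arriving call from slipping in ahead of the queue and pushing the total above the limit.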
❓ Problem 3: Memory leaks
typescript
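// Memory monitoring and cleanup
import * as v8 from 'v8';
export class MemoryManager {
private static memoryThreshold = 0.8; // threshold: 80% of the V8 heap size limit
static startMemoryMonitoring() {
setInterval(() => {
const usage = process.memoryUsage();
// heapTotal grows on demand, so heapUsed / heapTotal is high even in a healthy process;
// compare against the configured heap limit instead
const usagePercent = usage.heapUsed / v8.getHeapStatistics().heap_size_limit;
if (usagePercent > this.memoryThreshold) {
console.warn('High memory usage detected:', usage);
this.performGarbageCollection();
}
}, 30000); // check every 30 seconds
}
static performGarbageCollection() {
// global.gc is only defined when Node.js is started with --expose-gc
if (global.gc) {
global.gc();
console.log('Garbage collection performed');
} else {
console.warn('Garbage collection not available');
}
}
static clearCaches() {
// Clear application caches
// Clear AI model caches
// Clear the database connection pool
}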
}
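A single high reading says little about a leak, because heap usage normally rises and falls with garbage collection. What distinguishes a leak is a trend that keeps climbing. The heuristic below is a rough sketch of that idea: `looksLikeLeak`, the 50 MB growth floor, and the 80% rising ratio are all assumptions to tune per application, and the sample values are invented.

```typescript
// Returns true when heap usage rose in at least `minRisingRatio` of consecutive
// sample pairs and grew by more than `minGrowthBytes` from first to last sample.
function looksLikeLeak(
  heapSamples: number[],
  minGrowthBytes = 50 * 1024 * 1024,
  minRisingRatio = 0.8,
): boolean {
  if (heapSamples.length < 3) return false; // too few samples to call it a trend
  let rising = 0;
  for (let i = 1; i < heapSamples.length; i++) {
    if (heapSamples[i] > heapSamples[i - 1]) rising += 1;
  }
  const growth = heapSamples[heapSamples.length - 1] - heapSamples[0];
  return growth > minGrowthBytes && rising / (heapSamples.length - 1) >= minRisingRatio;
}

const MB = 1024 * 1024;
// Sawtooth of a healthy process: usage climbs, then GC brings it back down
const healthy = [100, 140, 90, 150, 95, 145].map((mb) => mb * MB);
// Steady climb that GC never reclaims
const leaking = [100, 130, 160, 155, 200, 240].map((mb) => mb * MB);

console.log(looksLikeLeak(healthy)); // false
console.log(looksLikeLeak(leaking)); // true
```

In a real service the samples would come from periodic `process.memoryUsage().heapUsed` readings, and a positive result is a cue to capture a heap snapshot rather than proof of a leak.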
14.27 Disaster Recovery and Backup Strategy
bash
#!/bin/bash
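# File: scripts/backup.sh
set -euo pipefail
BACKUP_DIR="/backups/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"
echo "💾 Starting backup..."
# 1. Back up the database
echo "📊 Backing up the database..."
kubectl exec -n ai-app deployment/postgres -- pg_dump -U postgres ai_app > "$BACKUP_DIR/database.sql"
# 2. Back up configuration
echo "⚙️ Backing up configuration..."
kubectl get configmap -n ai-app -o yaml > "$BACKUP_DIR/configmaps.yaml"
# Secret values are only base64-encoded in this file, so encrypt the archive or restrict access to it
kubectl get secret -n ai-app -o yaml > "$BACKUP_DIR/secrets.yaml"
# 3. Back up persistent data
echo "💿 Backing up persistent data..."
# BGSAVE runs in the background, so wait until LASTSAVE changes before copying the dump
LAST_SAVE=$(kubectl exec -n ai-app deployment/redis -- redis-cli LASTSAVE)
kubectl exec -n ai-app deployment/redis -- redis-cli BGSAVE
until [ "$(kubectl exec -n ai-app deployment/redis -- redis-cli LASTSAVE)" != "$LAST_SAVE" ]; do sleep 1; done
# redis-pod is a placeholder; substitute the actual Redis pod name
kubectl cp ai-app/redis-pod:/data/dump.rdb "$BACKUP_DIR/redis-dump.rdb"
# 4. Back up application manifests and configuration
echo "📁 Backing up application configuration..."
cp -r k8s/ "$BACKUP_DIR/"
cp -r monitoring/ "$BACKUP_DIR/"
cp docker-compose.yml "$BACKUP_DIR/"
# 5. Compress the backup
echo "🗜️ Compressing the backup..."
tar -czf "$BACKUP_DIR.tar.gz" -C /backups "$(basename "$BACKUP_DIR")"
rm -rf "$BACKUP_DIR"
# 6. Upload to cloud storage
echo "☁️ Uploading to cloud storage..."
aws s3 cp "$BACKUP_DIR.tar.gz" s3://your-backup-bucket/backups/
echo "✅ Backup complete: $BACKUP_DIR.tar.gz"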
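📚 Chapter Summary
In this chapter we covered:
✅ Theory
- Why deploying AI applications is complex and what the main challenges are
- How modern deployment architectures evolved
- What is specific about DevOps for AI applications
✅ Technical practice
- Best practices for containerizing with Docker
- Building a complete CI/CD pipeline
- A full monitoring and observability stack
- Enterprise-grade security measures
✅ Hands-on experience
- A complete Kubernetes deployment configuration
- Service mesh and microservice architecture
- Automated deployment and rollback scripts
- Disaster recovery and backup strategy
✅ Best practices
- Performance optimization and problem diagnosis
- Solutions to common problems
- Sensible monitoring and alerting configuration
- Security protection across the whole stack
🎯 Coming Up Next
In the next chapter, "AI Application Security and Ethics in Practice", we will dig into:
- Security threats to AI applications and how to defend against them
- Data privacy protection and compliance requirements
- AI ethics principles and responsible development
- Security auditing and risk assessment methods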