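01 Three Key Points

Before diving into the technical details, we first need to be clear about the core goals of designing a system at ten-million-request scale.

Keep these three key points in mind: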

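High performance: this is not simply about being fast, but about handling as many requests as possible with limited resources while preserving correctness. Our target is a P99 response time under 100 ms on core APIs and at least 5,000 QPS per machine.
High availability: the system must be able to recover from failures on its own. The ladder we aim for is "two nines" as the floor, "three nines" as the starting point, and "four nines" as the stretch goal (99.99% availability, which means no more than about 53 minutes of downtime per year).
Scalability: the system must scale smoothly as the business grows, at a controllable cost. This covers two dimensions: horizontal scaling (adding machines) and vertical scaling (improving single-machine performance).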
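
The availability and latency targets above come down to simple arithmetic. The following is a minimal sketch, assuming a 365-day year and the nearest-rank definition of a percentile; the class and method names are illustrative and do not come from the original system.

```java
import java.util.Arrays;

final class SloMath {
    // 365 days * 24 hours * 60 minutes, ignoring leap years
    private static final double MINUTES_PER_YEAR = 365 * 24 * 60;

    /** Maximum downtime per year, in minutes, allowed by an availability such as 0.9999. */
    static double downtimeMinutesPerYear(double availability) {
        return (1.0 - availability) * MINUTES_PER_YEAR;
    }

    /** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
    static long percentile(long[] samplesMillis, double p) {
        long[] sorted = samplesMillis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        for (double a : new double[] {0.99, 0.999, 0.9999}) {
            System.out.printf("%.2f%% -> %.1f minutes/year%n", a * 100, downtimeMinutesPerYear(a));
        }
        long[] latencies = new long[100];
        for (int i = 0; i < latencies.length; i++) {
            latencies[i] = i + 1; // synthetic samples: 1..100 ms
        }
        System.out.println("P99 = " + percentile(latencies, 99) + " ms");
    }
}
```

Each additional nine shrinks the downtime budget by a factor of ten, which is why the step from three nines to four nines usually needs automated failover rather than manual recovery.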

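02 Architecture Evolution: Four Steps from a Monolith to Ten-Million Scale

Let's start from the simplest possible e-commerce system and see how it evolves, step by step, to support traffic at ten-million scale.

Stage 1: Single-machine monolith (fewer than 100,000 requests per day)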

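// The simplest Spring Boot monolith.
// ProductService, OrderService, Product, Order and OrderRequest are application classes not shown here.
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
public class MonolithicApp {

    @Autowired
    private ProductService productService;

    @Autowired
    private OrderService orderService;

    @GetMapping("/product/{id}")
    public Product getProduct(@PathVariable Long id) {
        return productService.getById(id);
    }

    @PostMapping("/order")
    public Order createOrder(@RequestBody OrderRequest request) {
        return orderService.createOrder(request);
    }

    public static void main(String[] args) {
        SpringApplication.run(MonolithicApp.class, args);
    }
}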

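Problems:

All services run in a single JVM process, so one bug can bring down the whole system

The database is a single-point bottleneck and cannot scale horizontally

Releases require downtime, which hurts the user experience

Stage 2: Vertical split (100,000 to 1 million requests per day)

When load on the system grows, the first step is to split it vertically, by business domain: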

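// Product service, deployed independently.
// Product, Order, OrderRequest and the two repositories are application classes not shown here.
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.client.RestTemplate;

@SpringBootApplication
@RestController
@RequestMapping("/product")
public class ProductServiceApp {

    @Autowired
    private ProductRepository productRepository;

    @GetMapping("/{id}")
    public Product getProduct(@PathVariable Long id) {
        // Query the database directly
        return productRepository.findById(id).orElse(null);
    }
}

// Order service, deployed independently
@SpringBootApplication
@RestController
@RequestMapping("/order")
public class OrderServiceApp {

    // Needs a @LoadBalanced RestTemplate (or similar) so the logical host "product-service" resolves
    @Autowired
    private RestTemplate restTemplate;

    @Autowired
    private OrderRepository orderRepository;

    @PostMapping("/")
    public Order createOrder(@RequestBody OrderRequest request) {
        // Call the product service over HTTP
        Product product = restTemplate.getForObject(
                "http://product-service/product/" + request.getProductId(),
                Product.class);

        // Order-creation logic; Order.from(...) is a hypothetical factory standing in for it
        Order order = Order.from(request, product);
        return orderRepository.save(order);
    }
}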

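Key improvements:

The database is split by business domain: separate product and order databases

Services are deployed independently, which isolates failures

Each service can be optimized for its own workload

Stage 3: Horizontal scaling and service governance (1 million to 5 million requests per day)

When a single instance can no longer carry the traffic, we start scaling horizontally: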

# Example Kubernetes deployment manifest
apiVersion: apps/v1
kind: Deployment
metadata:
  name: product-service
spec:
  replicas: 3  # three instances
  selector:
    matchLabels:
      app: product-service
  template:
    metadata:
      labels:
        app: product-service
    spec:
      containers:
        - name: product-service
          image: product-service:latest
          resources:
            limits:
              memory: "512Mi"
              cpu: "500m"
          readinessProbe:  # readiness probe
            httpGet:
              path: /actuator/health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          livenessProbe:  # liveness probe
            httpGet:
              path: /actuator/health
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: product-service
spec:
  selector:
    app: product-service
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP

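Key improvements:

Each service runs as multiple instances behind a load balancer

Service registration and discovery is introduced (e.g. Nacos, Consul)

Health checks are added, enabling automatic failover

A configuration center manages configuration in one place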
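
To make the link between registration, health checks and failover concrete, here is a minimal in-memory sketch. It is not how Nacos or Consul are implemented; `HeartbeatRegistry`, its method names and the injected clock are assumptions made for illustration. Instances renew a heartbeat, and callers only ever see instances whose heartbeat is recent, so a dead instance drops out of rotation without manual intervention.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

final class HeartbeatRegistry {
    // service name -> (instance address -> timestamp of last heartbeat, in milliseconds)
    private final Map<String, Map<String, Long>> lastBeat = new ConcurrentHashMap<>();
    private final long timeoutMillis;
    private final LongSupplier clockMillis; // injected so the behaviour can be tested without sleeping

    HeartbeatRegistry(long timeoutMillis, LongSupplier clockMillis) {
        this.timeoutMillis = timeoutMillis;
        this.clockMillis = clockMillis;
    }

    /** Registers an instance, or renews its heartbeat if it is already known. */
    void heartbeat(String service, String address) {
        lastBeat.computeIfAbsent(service, s -> new ConcurrentHashMap<>())
                .put(address, clockMillis.getAsLong());
    }

    /** Returns the instances whose last heartbeat is within the timeout, sorted for stable output. */
    List<String> healthyInstances(String service) {
        long now = clockMillis.getAsLong();
        Map<String, Long> instances = lastBeat.getOrDefault(service, Collections.emptyMap());
        List<String> healthy = new ArrayList<>();
        for (Map.Entry<String, Long> e : instances.entrySet()) {
            if (now - e.getValue() <= timeoutMillis) {
                healthy.add(e.getKey());
            }
        }
        Collections.sort(healthy);
        return healthy;
    }
}
```

A client-side load balancer would call `healthyInstances("product-service")` before each pick; a production registry adds replication, push notifications and persistence on top of this idea.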

Stage 4: End-to-end optimization (more than 5 million requests per day)

As the system keeps growing, we need finer-grained optimization:

[Figure: overall architecture of a ten-million-scale system]

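The architecture diagram shows a complete system at ten-million scale.

Next, we look at each key component in depth.

03 Load Balancing: Highly Available Traffic Distribution

Load balancing is the first line of defense in a ten-million-scale system. We usually use a multi-layer load-balancing strategy:

Layer-4 load balancing (LVS/ELB)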

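# LVS DR-mode configuration example

# On each real server: bind the VIP to the loopback interface
ifconfig lo:0 192.168.1.100 netmask 255.255.255.255 up
route add -host 192.168.1.100 dev lo:0

# Suppress ARP for the VIP so that only the director answers for it
echo "1" > /proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" > /proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" > /proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" > /proc/sys/net/ipv4/conf/all/arp_announce

# On the LVS director: virtual service with weighted round-robin scheduling
ipvsadm -A -t 192.168.1.100:80 -s wrr
# DR mode (-g) forwards packets unmodified and cannot remap ports, so the real
# servers must listen on the same port as the virtual service (80, not 8080)
ipvsadm -a -t 192.168.1.100:80 -r 192.168.1.10:80 -g -w 1
ipvsadm -a -t 192.168.1.100:80 -r 192.168.1.11:80 -g -w 2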

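Layer-7 load balancing (Nginx / API gateway)

# Nginx load-balancing configuration (contents of the http block)

upstream backend_servers {
    # Weighted round-robin with passive health checks (max_fails / fail_timeout)
    server 192.168.1.10:8080 weight=3 max_fails=3 fail_timeout=30s;
    server 192.168.1.11:8080 weight=2 max_fails=3 fail_timeout=30s;
    server 192.168.1.12:8080 weight=1 max_fails=3 fail_timeout=30s;

    # Session persistence (enable when needed; sticky is an NGINX Plus directive)
    # sticky cookie srv_id expires=1h domain=.example.com path=/;

    # Backup server
    server 192.168.1.13:8080 backup;
}

# Rate-limit zone. limit_req_zone is only valid at http level,
# so it is declared here rather than inside the server block.
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=100r/s;

server {
    listen 80;
    server_name api.example.com;

    location /api/ {
        # Apply the rate limit
        limit_req zone=api_limit burst=50 nodelay;

        # Timeouts
        proxy_connect_timeout 3s;
        proxy_read_timeout 10s;
        proxy_send_timeout 10s;

        # Load balancing
        proxy_pass http://backend_servers;

        # Failover to the next upstream server
        proxy_next_upstream error timeout invalid_header http_500 http_502 http_503;
        proxy_next_upstream_timeout 0;
        proxy_next_upstream_tries 3;

        # Forwarding headers
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }

    # Health-check endpoint
    location /health {
        access_log off;
        return 200 "healthy\n";
    }
}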
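
To show what the weights in a weighted round-robin configuration do, here is a small sketch of the smooth weighted round-robin idea, using the same 3/2/1 weights as the upstream block. This is an illustrative Java class, not Nginx's own code, and `SmoothWeightedRoundRobin` is a name invented for the example. On every pick each server's running score grows by its weight, the highest score wins, and the winner's score is reduced by the total weight, which spreads the heavier server's turns out instead of sending them back to back.

```java
import java.util.ArrayList;
import java.util.List;

final class SmoothWeightedRoundRobin {
    private static final class Node {
        final String address;
        final int weight;
        int current; // running score, adjusted on every pick

        Node(String address, int weight) {
            this.address = address;
            this.weight = weight;
        }
    }

    private final List<Node> nodes = new ArrayList<>();
    private int totalWeight;

    synchronized void add(String address, int weight) {
        nodes.add(new Node(address, weight));
        totalWeight += weight;
    }

    /** Picks the next server; synchronized because every pick mutates the shared scores. */
    synchronized String next() {
        Node best = null;
        for (Node n : nodes) {
            n.current += n.weight;
            if (best == null || n.current > best.current) {
                best = n;
            }
        }
        if (best == null) {
            throw new IllegalStateException("no servers configured");
        }
        best.current -= totalWeight;
        return best.address;
    }

    public static void main(String[] args) {
        SmoothWeightedRoundRobin lb = new SmoothWeightedRoundRobin();
        lb.add("192.168.1.10:8080", 3);
        lb.add("192.168.1.11:8080", 2);
        lb.add("192.168.1.12:8080", 1);
        for (int i = 0; i < 6; i++) {
            System.out.println(lb.next());
        }
    }
}
```

Over any window of six picks (the sum of the weights), the three servers are chosen three, two and one times respectively.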

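Modern API gateway (Spring Cloud Gateway)

import org.springframework.cloud.gateway.filter.ratelimit.KeyResolver;
import org.springframework.cloud.gateway.filter.ratelimit.RedisRateLimiter;
import org.springframework.cloud.gateway.route.RouteLocator;
import org.springframework.cloud.gateway.route.builder.RouteLocatorBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import org.springframework.http.HttpStatus;
import reactor.core.publisher.Mono;

@Configuration
public class GatewayConfig {

    @Bean
    public RouteLocator customRouteLocator(RouteLocatorBuilder builder) {
        return builder.routes()
            .route("product_route", r -> r
                .path("/api/product/**")
                .filters(f -> f
                    .requestRateLimiter(config -> config
                        .setRateLimiter(redisRateLimiter())
                        .setKeyResolver(ipKeyResolver()))
                    .circuitBreaker(config -> config
                        .setName("productCircuitBreaker")
                        .setFallbackUri("forward:/fallback/product"))
                    // Strip the /api prefix: /api/product/1 becomes /product/1
                    .rewritePath("/api/(?<segment>.*)", "/${segment}")
                )
                .uri("lb://product-service"))
            .route("order_route", r -> r
                .path("/api/order/**")
                .filters(f -> f
                    .requestRateLimiter(config -> config
                        .setRateLimiter(redisRateLimiter())
                        .setKeyResolver(userKeyResolver()))
                    // The retry filter only retries GET requests unless configured otherwise,
                    // which avoids replaying non-idempotent order writes
                    .retry(config -> config
                        .setRetries(3)
                        .setStatuses(HttpStatus.INTERNAL_SERVER_ERROR))
                )
                .uri("lb://order-service"))
            .build();
    }

    @Bean
    public RedisRateLimiter redisRateLimiter() {
        // 10 requests per second, burst capacity 20, 1 token per request
        return new RedisRateLimiter(10, 20, 1);
    }

    @Bean
    @Primary // with two KeyResolver beans, one has to be primary for the gateway auto-configuration
    public KeyResolver ipKeyResolver() {
        return exchange -> Mono.just(
            exchange.getRequest().getRemoteAddress().getAddress().getHostAddress()
        );
    }

    @Bean
    public KeyResolver userKeyResolver() {
        // Not shown in the original. Assumption: the user ID arrives in an X-User-Id header.
        return exchange -> Mono.justOrEmpty(
            exchange.getRequest().getHeaders().getFirst("X-User-Id"));
    }
}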
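
The two numbers passed to the rate limiter, a replenish rate of 10 and a burst capacity of 20, are the parameters of a token bucket. The sketch below illustrates that idea in plain Java with an injected clock. It is a single-process illustration under stated assumptions, not the Redis-backed implementation the gateway uses, and `TokenBucket` is a name chosen for the example.

```java
import java.util.function.LongSupplier;

final class TokenBucket {
    private final double replenishPerSecond; // tokens added per second
    private final double burstCapacity;      // maximum tokens the bucket can hold
    private final LongSupplier nanoClock;    // injected so tests can control time
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(double replenishPerSecond, double burstCapacity, LongSupplier nanoClock) {
        this.replenishPerSecond = replenishPerSecond;
        this.burstCapacity = burstCapacity;
        this.nanoClock = nanoClock;
        this.tokens = burstCapacity; // start full, so an initial burst is allowed
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Takes one token if available; returns false when the request should be rejected. */
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        // Refill in proportion to elapsed time, never above the burst capacity
        tokens = Math.min(burstCapacity, tokens + elapsedSeconds * replenishPerSecond);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        TokenBucket bucket = new TokenBucket(10, 20, System::nanoTime);
        int allowed = 0;
        for (int i = 0; i < 100; i++) {
            if (bucket.tryAcquire()) {
                allowed++;
            }
        }
        System.out.println("allowed in the first burst: " + allowed);
    }
}
```

The burst capacity bounds how many requests can pass at once after an idle period, while the replenish rate bounds the sustained throughput.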

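04 Caching Strategy: Intelligent Layering for Performance

Caching is one of the most effective ways to improve system performance. A ten-million-scale system needs a carefully designed multi-level caching strategy.

Four-level cache architecture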

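Local cache (Caffeine)

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;
import org.springframework.stereotype.Component;

@Component
public class LocalCacheManager {

    // Placeholder cached instead of null, so lookups for missing keys do not keep hitting the loader
    private static final Object NULL_VALUE = new Object();

    // Level 1: Guava Cache (suits smaller data sets); fully qualified to avoid clashing with Caffeine's Cache
    private final com.google.common.cache.Cache<String, Object> guavaCache = CacheBuilder.newBuilder()
            .maximumSize(10000)
            .expireAfterWrite(5, TimeUnit.MINUTES)
            .recordStats()
            .build();

    // Level 2: Caffeine (high performance, W-TinyLFU eviction)
    private final Cache<String, Object> caffeineCache = Caffeine.newBuilder()
            .maximumSize(50000)
            .expireAfterWrite(10, TimeUnit.MINUTES)
            // refreshAfterWrite(1, TimeUnit.MINUTES) is dropped here: Caffeine only allows it on a
            // LoadingCache built with a CacheLoader, and build() without one fails at startup
            .recordStats()
            .build();

    // Dedicated cache for hot data (e.g. flash-sale products)
    private final Cache<String, Object> hotDataCache = Caffeine.newBuilder()
            .maximumSize(1000)
            .expireAfterWrite(30, TimeUnit.SECONDS) // hot data expires quickly
            .recordStats()
            .build();

    public <T> T getWithCache(String key, Class<T> clazz, Supplier<T> loader, CacheLevel level) {
        switch (level) {
            case LOCAL_HOT:
                return getFromLocalCache(hotDataCache, key, clazz, loader);
            case LOCAL_NORMAL:
                return getFromLocalCache(caffeineCache, key, clazz, loader);
            case DISTRIBUTED:
                // The distributed lookup is not shown in the original; the loader is expected to do it
            default:
                return loader.get();
        }
    }

    private <T> T getFromLocalCache(Cache<String, Object> cache, String key,
                                    Class<T> clazz, Supplier<T> loader) {
        try {
            Object cached = cache.get(key, k -> {
                T value = loader.get();
                // Cache a placeholder for null to guard against cache penetration
                return value == null ? NULL_VALUE : value;
            });
            return cached == NULL_VALUE ? null : clazz.cast(cached);
        } catch (RuntimeException e) {
            // Local cache failure: degrade to loading directly
            return loader.get();
        }
    }
}

// Cache levels referenced by getWithCache (the enum is not shown in the original)
enum CacheLevel { LOCAL_HOT, LOCAL_NORMAL, DISTRIBUTED }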

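// Usage example (Product and ProductRepository are application classes not shown here)
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

@Service
public class ProductService {

    @Autowired
    private LocalCacheManager cacheManager;

    @Autowired
    private RedisTemplate<String, Object> redisTemplate;

    @Autowired
    private ProductRepository productRepository;

    public Product getProductWithCache(Long productId) {
        String cacheKey = "product:" + productId;

        return cacheManager.getWithCache(cacheKey, Product.class, () -> {
            // Check the Redis distributed cache first
            Object cached = redisTemplate.opsForValue().get(cacheKey);
            if (cached instanceof Product) {
                return (Product) cached;
            }

            // Not in Redis: query the database
            Product product = productRepository.findById(productId).orElse(null);
            if (product != null) {
                // Write back to Redis asynchronously so the current request is not blocked
                CompletableFuture.runAsync(() ->
                        redisTemplate.opsForValue().set(cacheKey, product, 1, TimeUnit.HOURS));
            }
            return product;
        }, CacheLevel.LOCAL_NORMAL);
    }
}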

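Distributed cache (Redis Cluster)

import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.cache.RedisCacheConfiguration;
import org.springframework.data.redis.cache.RedisCacheManager;
import org.springframework.data.redis.connection.RedisClusterConfiguration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.connection.RedisNode;
import org.springframework.data.redis.connection.jedis.JedisConnectionFactory;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.serializer.GenericJackson2JsonRedisSerializer;
import org.springframework.data.redis.serializer.RedisSerializationContext;
import org.springframework.data.redis.serializer.StringRedisSerializer;

@Configuration
public class RedisConfig {

    @Bean
    public RedisConnectionFactory redisConnectionFactory() {
        RedisClusterConfiguration clusterConfig = new RedisClusterConfiguration();
        // Cluster nodes
        clusterConfig.addClusterNode(new RedisNode("192.168.1.100", 6379));
        clusterConfig.addClusterNode(new RedisNode("192.168.1.101", 6379));
        clusterConfig.addClusterNode(new RedisNode("192.168.1.102", 6379));
        // Cluster settings
        clusterConfig.setMaxRedirects(3); // maximum number of redirects to follow
        return new JedisConnectionFactory(clusterConfig);
    }

    @Bean
    public RedisTemplate<String, Object> redisTemplate() {
        RedisTemplate<String, Object> template = new RedisTemplate<>();
        template.setConnectionFactory(redisConnectionFactory());
        // String keys, JSON values
        template.setKeySerializer(new StringRedisSerializer());
        template.setValueSerializer(new GenericJackson2JsonRedisSerializer());
        // The original enabled transaction support here. It is left off because Spring Data Redis
        // does not support MULTI/EXEC on cluster connections.
        return template;
    }

    @Bean
    public RedisCacheManager cacheManager(RedisConnectionFactory connectionFactory) {
        RedisCacheConfiguration config = RedisCacheConfiguration.defaultCacheConfig()
            .entryTtl(Duration.ofMinutes(30)) // default TTL
            .disableCachingNullValues()       // do not cache null values
            .serializeKeysWith(RedisSerializationContext.SerializationPair
                .fromSerializer(new StringRedisSerializer()))
            .serializeValuesWith(RedisSerializationContext.SerializationPair
                .fromSerializer(new GenericJackson2JsonRedisSerializer()));

        // Per-cache configuration for different cache regions
        Map<String, RedisCacheConfiguration> cacheConfigurations = new HashMap<>();
        cacheConfigurations.put("product", config.entryTtl(Duration.ofHours(1)));
        cacheConfigurations.put("user", config.entryTtl(Duration.ofDays(1)));

        return RedisCacheManager.builder(connectionFactory)
            .cacheDefaults(config)
            .withInitialCacheConfigurations(cacheConfigurations)
            .transactionAware()
            .build();
    }
}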

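Cache strategies and problem handling

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import java.util.Collections;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.TimeUnit;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;

@Service
public class CacheStrategyService {

    // The fields and the helper methods at the bottom are not shown in the original;
    // they are minimal stand-ins so that the class is complete.
    private static final int MAX_LOCK_ATTEMPTS = 3;
    private final String instanceId = UUID.randomUUID().toString();
    private final Cache<String, Object> localCache = Caffeine.newBuilder().maximumSize(1000).build();

    @Autowired
    private RedisTemplate<String, Object> redisTemplate;
    @Autowired
    private StringRedisTemplate stringRedisTemplate;
    @Autowired
    private ProductRepository productRepository;

    /**
     * Cache penetration: cache a marker for rows that do not exist.
     * Cache breakdown: rebuild a missing entry under a distributed lock.
     */
    public Product getProductWithNullCache(Long productId) {
        String cacheKey = "product:" + productId;
        String nullCacheKey = "product_null:" + productId;
        String lockKey = "lock:product:" + productId;

        // A bounded retry loop replaces the original unbounded recursion
        for (int attempt = 0; attempt < MAX_LOCK_ATTEMPTS; attempt++) {
            // Check the null-marker cache first
            if (Boolean.TRUE.equals(redisTemplate.hasKey(nullCacheKey))) {
                return null; // known to be missing: return without querying the database
            }
            Product product = (Product) redisTemplate.opsForValue().get(cacheKey);
            if (product != null) {
                return product;
            }

            // Take a distributed lock to guard against cache breakdown
            if (tryLock(lockKey, 3, TimeUnit.SECONDS)) {
                try {
                    // Double-check after acquiring the lock
                    product = (Product) redisTemplate.opsForValue().get(cacheKey);
                    if (product != null) {
                        return product;
                    }
                    // Query the database
                    product = productRepository.findById(productId).orElse(null);
                    if (product == null) {
                        // Cache the null marker with a short TTL
                        redisTemplate.opsForValue().set(nullCacheKey, "NULL", 5, TimeUnit.MINUTES);
                    } else {
                        // Cache the data
                        redisTemplate.opsForValue().set(cacheKey, product, 1, TimeUnit.HOURS);
                    }
                    return product;
                } finally {
                    releaseLock(lockKey);
                }
            }

            // Lock not acquired: wait briefly, then retry
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return null; // still locked after all attempts: degrade
    }

    /** Cache avalanche: randomize expiry times so keys do not all expire together. */
    public void setWithRandomExpire(String key, Object value, long baseExpire, TimeUnit unit) {
        long expireTime = unit.toMillis(baseExpire);
        // Add a random offset of ±20%
        double randomFactor = 0.8 + Math.random() * 0.4; // 0.8 to 1.2
        long actualExpire = (long) (expireTime * randomFactor);
        redisTemplate.opsForValue().set(key, value, actualExpire, TimeUnit.MILLISECONDS);
    }

    /** Hot-data discovery and automatic local caching. */
    @Scheduled(fixedDelay = 60000) // runs once a minute
    public void discoverHotData() {
        // Collect access-frequency statistics from Redis
        Set<String> hotKeys = findHotKeys();
        for (String key : hotKeys) {
            // Load hot data into the local cache
            Object value = redisTemplate.opsForValue().get(key);
            if (value != null) {
                localCache.put(key, value);
            }
        }
    }

    // Assumption: the timeout passed to tryLock is the lock's TTL (SET key value NX with expiry)
    private boolean tryLock(String lockKey, long ttl, TimeUnit unit) {
        return Boolean.TRUE.equals(
                stringRedisTemplate.opsForValue().setIfAbsent(lockKey, lockOwner(), ttl, unit));
    }

    private void releaseLock(String lockKey) {
        // Delete only if this thread still owns the lock (compare-and-delete in one Lua script)
        String script = "if redis.call('get', KEYS[1]) == ARGV[1] "
                + "then return redis.call('del', KEYS[1]) else return 0 end";
        stringRedisTemplate.execute(new DefaultRedisScript<>(script, Long.class),
                Collections.singletonList(lockKey), lockOwner());
    }

    private String lockOwner() {
        return instanceId + ":" + Thread.currentThread().getId();
    }

    private Set<String> findHotKeys() {
        // Placeholder: hot-key detection is not shown in the original
        return Collections.emptySet();
    }
}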
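
The hot-data job relies on a `findHotKeys()` step that the listing does not show. One simple way to think about it is a per-window access counter: count reads per key, and at the end of each window report the keys that crossed a threshold. The sketch below is an in-process illustration under that assumption; the original gathers its statistics from Redis, and `HotKeyCounter`, `record` and `drainHotKeys` are hypothetical names.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

final class HotKeyCounter {
    // key -> number of accesses seen in the current window
    private final ConcurrentHashMap<String, LongAdder> counts = new ConcurrentHashMap<>();

    /** Call on every cache read. */
    void record(String key) {
        counts.computeIfAbsent(key, k -> new LongAdder()).increment();
    }

    /** Returns the keys seen at least threshold times since the last call, then starts a new window. */
    Set<String> drainHotKeys(long threshold) {
        Set<String> hot = new TreeSet<>();
        for (Map.Entry<String, LongAdder> e : counts.entrySet()) {
            if (e.getValue().sum() >= threshold) {
                hot.add(e.getKey());
            }
        }
        // Increments that race with clear() may be lost, which is acceptable for a heuristic
        counts.clear();
        return hot;
    }

    public static void main(String[] args) {
        HotKeyCounter counter = new HotKeyCounter();
        for (int i = 0; i < 5; i++) {
            counter.record("product:1");
        }
        counter.record("product:2");
        System.out.println(counter.drainHotKeys(3));
    }
}
```

A scheduled job like `discoverHotData()` could call `drainHotKeys` once per minute and preload the returned keys into the local cache. With many service instances, each instance only sees its own traffic, so a shared counter in Redis gives a more complete picture.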
