瓶让好焕三个关键点
在深入技术细节前,我们先要明确设计千万级系统的核心目标。
记住这三个关键点:
image
-
高性能:不是简单追求快,而是要在保证正确性的前提下,用有限的资源处理尽可能多的请求。我们的目标是核心接口P99响应时间低于100毫秒,单机QPS不低于5000。
-
高可用:系统需要具备故障自愈能力。我们追求的是"两个9"打底,"三个9"起步,"四个9"努力的目标(即99.99%可用性,全年不可用时间不超过53分钟)。
-
可扩展:系统要能随着业务增长而平滑扩展,且扩展成本要可控。这里包括水平扩展(加机器)和垂直扩展(优化单机性能)两个维度。
02 架构演进:从单体到千万级的四步走
让我们从一个最简单的电商系统开始,看看它是如何一步步演进到支撑千万级流量的。
阶段一:单机单体架构(日请求<10万)
// 最简单的Spring Boot单体应用
@SpringBootApplication
@RestController
public class MonolithicApp {
@Autowired
private ProductService productService;
@Autowired
private OrderService orderService;
@GetMapping("/product/{id}")
public Product getProduct(@PathVariable Long id) {
return productService.getById(id);
}
@PostMapping("/order")
public Order createOrder(@RequestBody OrderRequest request) {
return orderService.createOrder(request);
}
public static void main(String[] args) {
SpringApplication.run(MonolithicApp.class, args);
}
}
问题分析:
所有服务都在一个JVM进程中,一个bug可能导致整个系统崩溃
数据库成为单点瓶颈,无法水平扩展
发布时需要停机,影响用户体验
阶段二:垂直拆分(日请求10万-100万)
当系统压力增大时,我们首先进行垂直拆分:
// 商品服务 - 独立部署
@SpringBootApplication
@RestController
@RequestMapping("/product")
public class ProductServiceApp {
@GetMapping("/{id}")
public Product getProduct(@PathVariable Long id) {
// 直接查询数据库
return productRepository.findById(id).orElse(null);
}
}
// 订单服务 - 独立部署
@SpringBootApplication
@RestController
@RequestMapping("/order")
public class OrderServiceApp {
@PostMapping("/")
public Order createOrder(@RequestBody OrderRequest request) {
// 通过HTTP调用商品服务
Product product = restTemplate.getForObject(
"http://product-service/product/" + request.getProductId(),
Product.class
);
// 创建订单逻辑
return orderRepository.save(order);
}
}
关键改进:
数据库按业务拆分:商品库、订单库分离
服务独立部署,故障隔离
可以针对不同服务进行针对性优化
阶段三:水平扩展与服务治理(日请求100万-500万)
当单实例无法承载流量时,我们开始水平扩展:
Kubernetes部署配置文件示例
apiVersion: apps/v1
kind: Deployment
metadata:
name: product-service
spec:
replicas: 3 # 3个实例
selector:
matchLabels:
app: product-service
template:
metadata:
labels:
app: product-service
spec:
containers:
- name: product-service
image: product-service:latest
resources:
limits:
memory: "512Mi"
cpu: "500m"
readinessProbe: # 就绪探针
httpGet:
path: /actuator/health
port: 8080
initialDelaySeconds: 30
periodSeconds: 10
livenessProbe: # 存活探针
httpGet:
path: /actuator/health
port: 8080
initialDelaySeconds: 60
periodSeconds: 10
apiVersion: v1
kind: Service
metadata:
name: product-service
spec:
selector:
app: product-service
ports:
- port: 80
targetPort: 8080
type: ClusterIP
关键改进:
服务多实例部署,负载均衡
引入服务注册与发现(如Nacos、Consul)
增加健康检查,实现故障自动转移
配置中心统一管理配置
阶段四:全链路优化(日请求500万以上)
当系统规模继续扩大,我们需要更精细的优化:
image
image
这个架构图展示了一个完整的千万级系统架构。
接下来,我们深入每个关键组件。
03 负载均衡:流量分发的高可用之道
负载均衡是千万级系统的第一道防线。我们通常采用多层负载均衡策略:
四层负载均衡(LVS/ELB)
LVS DR模式配置示例
真实服务器配置回环接口
ifconfig lo:0 192.168.1.100 netmask 255.255.255.255 up
route add -host 192.168.1.100 dev lo:0
配置ARP抑制
echo "1" > /proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" > /proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" > /proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" > /proc/sys/net/ipv4/conf/all/arp_announce
LVS Director配置
ipvsadm -A -t 192.168.1.100:80 -s wrr
ipvsadm -a -t 192.168.1.100:80 -r 192.168.1.10:8080 -g -w 1
ipvsadm -a -t 192.168.1.100:80 -r 192.168.1.11:8080 -g -w 2
七层负载均衡(Nginx/API网关)
Nginx负载均衡配置
upstream backend_servers {
加权轮询,配合健康检查
server 192.168.1.10:8080 weight=3 max_fails=3 fail_timeout=30s;
server 192.168.1.11:8080 weight=2 max_fails=3 fail_timeout=30s;
server 192.168.1.12:8080 weight=1 max_fails=3 fail_timeout=30s;
会话保持(需要时开启)
sticky cookie srv_id expires=1h domain=.example.com path=/;
备份服务器
server 192.168.1.13:8080 backup;
}
server {
listen 80;
server_name api.example.com;
限流配置
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=100r/s;
location /api/ {
应用限流
limit_req zone=api_limit burst=50 nodelay;
连接超时设置
proxy_connect_timeout 3s;
proxy_read_timeout 10s;
proxy_send_timeout 10s;
负载均衡
proxy_pass http://backend_servers;
故障转移
proxy_next_upstream error timeout invalid_header http_500 http_502 http_503;
proxy_next_upstream_timeout 0;
proxy_next_upstream_tries 3;
添加代理头
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
健康检查端点
location /health {
access_log off;
return 200 "healthy\n";
}
}
现代API网关(Spring Cloud Gateway)
@Configuration
public class GatewayConfig {
@Bean
public RouteLocator customRouteLocator(RouteLocatorBuilder builder) {
return builder.routes()
.route("product_route", r -> r
.path("/api/product/**")
.filters(f -> f
.requestRateLimiter(config -> config
.setRateLimiter(redisRateLimiter())
.setKeyResolver(ipKeyResolver()))
.circuitBreaker(config -> config
.setName("productCircuitBreaker")
.setFallbackUri("forward:/fallback/product"))
.rewritePath("/api/(?.*)", "/${segment}")
)
.uri("lb://product-service"))
.route("order_route", r -> r
.path("/api/order/**")
.filters(f -> f
.requestRateLimiter(config -> config
.setRateLimiter(redisRateLimiter())
.setKeyResolver(userKeyResolver()))
.retry(config -> config
.setRetries(3)
.setStatuses(HttpStatus.INTERNAL_SERVER_ERROR))
)
.uri("lb://order-service"))
.build();
}
@Bean
public RedisRateLimiter redisRateLimiter() {
// 每秒10个请求,突发容量20
return new RedisRateLimiter(10, 20, 1);
}
@Bean
public KeyResolver ipKeyResolver() {
return exchange -> Mono.just(
exchange.getRequest().getRemoteAddress().getAddress().getHostAddress()
);
}
}
04 缓存策略:性能加速的智能分层
缓存是提升系统性能最有效的手段之一。千万级系统需要设计智能的多级缓存策略。
四级缓存架构
image
- 本地缓存(Caffeine)
@Component
public class LocalCacheManager {
// 一级缓存:Guava Cache(适合较小数据量)
private final Cache guavaCache = CacheBuilder.newBuilder()
.maximumSize(10000)
.expireAfterWrite(5, TimeUnit.MINUTES)
.recordStats()
.build();
// 二级缓存:Caffeine(高性能,W-TinyLFU算法)
private final Cache caffeineCache = Caffeine.newBuilder()
.maximumSize(50000)
.expireAfterWrite(10, TimeUnit.MINUTES)
.refreshAfterWrite(1, TimeUnit.MINUTES)
.recordStats()
.build();
// 热点数据特殊缓存(如秒杀商品)
private final Cache hotDataCache = Caffeine.newBuilder()
.maximumSize(1000)
.expireAfterWrite(30, TimeUnit.SECONDS) // 热点数据过期快
.recordStats()
.build();
public T getWithCache(String key, Class clazz,
Supplier loader, CacheLevel level) {
switch (level) {
case LOCAL_HOT:
return getFromHotCache(key, clazz, loader);
case LOCAL_NORMAL:
return getFromLocalCache(key, clazz, loader);
case DISTRIBUTED:
return getFromDistributedCache(key, clazz, loader);
default:
return loader.get();
}
}
private T getFromLocalCache(String key, Class clazz, Supplier loader) {
try {
return (T) caffeineCache.get(key, k -> {
T value = loader.get();
if (value == null) {
// 缓存空值,防止缓存穿透
return new NullValue();
}
return value;
});
} catch (Exception e) {
// 本地缓存异常,降级直接加载
return loader.get();
}
}
}
// 使用示例
@Service
public class ProductService {
@Autowired
private LocalCacheManager cacheManager;
@Autowired
private RedisTemplate redisTemplate;
public Product getProductWithCache(Long productId) {
String cacheKey = "product:" + productId;
return cacheManager.getWithCache(cacheKey, Product.class, () -> {
// 先查Redis分布式缓存
Product product = redisTemplate.opsForValue().get(cacheKey);
if (product != null) {
return product;
}
// Redis没有,查数据库
product = productRepository.findById(productId).orElse(null);
if (product != null) {
// 异步回写到Redis,不阻塞当前请求
CompletableFuture.runAsync(() ->
redisTemplate.opsForValue().set(cacheKey, product, 1, TimeUnit.HOURS)
);
}
return product;
}, CacheLevel.LOCAL_NORMAL);
}
}
- 分布式缓存(Redis集群)
@Configuration
public class RedisConfig {
@Bean
public RedisConnectionFactory redisConnectionFactory() {
RedisClusterConfiguration clusterConfig = new RedisClusterConfiguration();
// 集群节点配置
clusterConfig.addClusterNode(new RedisNode("192.168.1.100", 6379));
clusterConfig.addClusterNode(new RedisNode("192.168.1.101", 6379));
clusterConfig.addClusterNode(new RedisNode("192.168.1.102", 6379));
// 集群配置
clusterConfig.setMaxRedirects(3); // 最大重定向次数
return new JedisConnectionFactory(clusterConfig);
}
@Bean
public RedisTemplate redisTemplate() {
RedisTemplate template = new RedisTemplate<>();
template.setConnectionFactory(redisConnectionFactory());
// 使用String序列化器
template.setKeySerializer(new StringRedisSerializer());
template.setValueSerializer(new GenericJackson2JsonRedisSerializer());
// 开启事务支持
template.setEnableTransactionSupport(true);
return template;
}
@Bean
public RedisCacheManager cacheManager(RedisConnectionFactory connectionFactory) {
RedisCacheConfiguration config = RedisCacheConfiguration.defaultCacheConfig()
.entryTtl(Duration.ofMinutes(30)) // 默认过期时间
.disableCachingNullValues() // 不缓存null值
.serializeKeysWith(RedisSerializationContext.SerializationPair
.fromSerializer(new StringRedisSerializer()))
.serializeValuesWith(RedisSerializationContext.SerializationPair
.fromSerializer(new GenericJackson2JsonRedisSerializer()));
// 不同缓存区域的不同配置
Map cacheConfigurations = new HashMap<>();
cacheConfigurations.put("product", config.entryTtl(Duration.ofHours(1)));
cacheConfigurations.put("user", config.entryTtl(Duration.ofDays(1)));
return RedisCacheManager.builder(connectionFactory)
.cacheDefaults(config)
.withInitialCacheConfigurations(cacheConfigurations)
.transactionAware()
.build();
}
}
- 缓存策略与问题解决
@Service
public class CacheStrategyService {
/**
* 防止缓存穿透:缓存空值
*/
public Product getProductWithNullCache(Long productId) {
String cacheKey = "product:" + productId;
String nullCacheKey = "product_null:" + productId;
// 先检查空值缓存
if (Boolean.TRUE.equals(redisTemplate.hasKey(nullCacheKey))) {
return null; // 知道是空值,直接返回,不查数据库
}
Product product = redisTemplate.opsForValue().get(cacheKey);
if (product != null) {
return product;
}
// 加分布式锁,防止缓存击穿
String lockKey = "lock:product:" + productId;
boolean locked = false;
try {
locked = tryLock(lockKey, 3, TimeUnit.SECONDS);
if (locked) {
// 双重检查
product = redisTemplate.opsForValue().get(cacheKey);
if (product != null) {
return product;
}
// 查询数据库
product = productRepository.findById(productId).orElse(null);
if (product == null) {
// 缓存空值,过期时间短
redisTemplate.opsForValue().set(nullCacheKey, "NULL", 5, TimeUnit.MINUTES);
} else {
// 缓存数据
redisTemplate.opsForValue().set(cacheKey, product, 1, TimeUnit.HOURS);
}
return product;
} else {
// 未获取到锁,短暂等待后重试或返回降级数据
Thread.sleep(100);
return getProductWithNullCache(productId);
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
return null;
} finally {
if (locked) {
releaseLock(lockKey);
}
}
}
/**
* 防止缓存雪崩:随机过期时间
*/
public void setWithRandomExpire(String key, Object value, long baseExpire, TimeUnit unit) {
long expireTime = unit.toMillis(baseExpire);
// 增加随机偏移量(±20%)
double randomFactor = 0.8 + Math.random() * 0.4; // 0.8 ~ 1.2
long actualExpire = (long) (expireTime * randomFactor);
redisTemplate.opsForValue().set(
key, value, actualExpire, TimeUnit.MILLISECONDS
);
}
/**
* 热点数据发现与自动缓存
*/
@Scheduled(fixedDelay = 60000) // 每分钟执行一次
public void discoverHotData() {
// 从Redis统计访问频率
Set hotKeys = findHotKeys();
for (String key : hotKeys) {
// 将热点数据加载到本地缓存
Object value = redisTemplate.opsForValue().get(key);
if (value != null) {
localCache.put(key, value);
}
}
}
}