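Preparing to Hand-Write Simple Raft (3): Log Replication and the Consistency Check

About the author: a 20-year IT veteran with 15 years of architecture experience, focused on distributed systems and middleware development. Currently working at a financial company, has hand-written many kinds of middleware, and has done deep secondary development on open-source projects such as JaCoCo and Nightingale (夜莺).

The full code is open source: gitee.com/sh_wangwanb... Branch for this chapter: ch3

In the previous post we got Leader election working: three nodes can elect a Leader and keep it alive with heartbeats. To be honest, though, that was only the appetizer.

Election answers "who is the boss", but what does the boss do once elected? How does it coordinate everyone's work? That is the core problem of Raft.

Recall the point from the first post: at its heart, Raft guarantees that all nodes execute the same operations in the same order. We now have a Leader, but we have neither "operations" nor "order" yet.

In this post we implement the first part of log replication: the consistency check.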

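Why we need a log

Start with a scenario: three clients send requests to the Leader at the same time.

```text
Client-1 → Leader: SET name=alice
Client-2 → Leader: SET name=bob
Client-3 → Leader: DELETE name
```

What goes wrong if the Leader simply broadcasts these operations to the Followers?

Each Follower may receive the requests in a different order. Network delay, packet loss, retransmission: any of these can happen. In the end the three nodes may hold completely different states:

```text
Node-1: name=alice
Node-2: name=bob
Node-3: (empty)
```

That is a mess.

The log exists to solve exactly this problem. Every operation is first written to the log, the log has a strict order (the index), and every node executes operations in that same log order, so the final state is guaranteed to be identical.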

```mermaid
flowchart LR
    subgraph requests [Concurrent requests]
        C1[Client-1: SET alice]
        C2[Client-2: SET bob]
        C3[Client-3: DELETE]
    end
    subgraph leaderlog [Leader log]
        L1["[1] SET alice"]
        L2["[2] SET bob"]
        L3["[3] DELETE"]
    end
    C1 --> L1
    C2 --> L2
    C3 --> L3
    L1 --> F1["Followers execute in order"]
    L2 --> F1
    L3 --> F1
```

Put simply, the log turns concurrency in time into an order in space. This is the core design idea of Raft.
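To make the ordering argument concrete, here is a minimal sketch. The `KvStateMachine` and `LogOrderDemo` classes and the text command format are made up for illustration and are not part of the project's code. It replays the same three commands in log order and in a different arrival order.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy key-value state machine; not part of the article's code.
class KvStateMachine {
    private final Map<String, String> data = new HashMap<>();

    // Commands use a made-up text format: "SET key=value" or "DELETE key".
    void apply(String command) {
        if (command.startsWith("SET ")) {
            String[] kv = command.substring(4).split("=", 2);
            data.put(kv[0], kv[1]);
        } else if (command.startsWith("DELETE ")) {
            data.remove(command.substring(7));
        } else {
            throw new IllegalArgumentException("unknown command: " + command);
        }
    }

    Map<String, String> snapshot() {
        return new HashMap<>(data);
    }
}

class LogOrderDemo {
    // Replays the whole log, in index order, into a fresh state machine.
    static Map<String, String> replay(List<String> log) {
        KvStateMachine sm = new KvStateMachine();
        for (String command : log) {
            sm.apply(command);
        }
        return sm.snapshot();
    }

    public static void main(String[] args) {
        // The Leader fixed this order when it appended the three requests.
        List<String> log = Arrays.asList("SET name=alice", "SET name=bob", "DELETE name");
        System.out.println("node-1: " + replay(log));
        System.out.println("node-2: " + replay(log));

        // Same commands, different arrival order: the final state differs.
        List<String> reordered = Arrays.asList("DELETE name", "SET name=bob", "SET name=alice");
        System.out.println("unordered broadcast: " + replay(reordered));
    }
}
```

Any node that replays the same log ends up with the same map, while the reordered list leaves `name=alice` behind. This only holds because `apply` is deterministic, which is what a replicated state machine requires.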

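What a log entry looks like

Let's define the structure of a log entry first. It is quite simple:

```java
public class LogEntry {
    private long index;      // log index, starting at 1
    private long term;       // term of the Leader that created this entry
    private byte[] command;  // the application command itself

    // constructor and getters omitted
}
```

Three fields are enough. index is the entry's position in the log, term is the term in which it was created, and command is the operation to execute.

Why store the term? It comes up in the consistency check below: the term is the key to deciding whether two logs conflict.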

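The storage interface is not complicated either:

```java
import java.util.List;

public interface LogStorage {
    void append(LogEntry entry);              // append an entry
    LogEntry getEntry(long index);            // read by index
    long getLastIndex();                      // index of the last entry
    long getLastTerm();                       // term of the last entry
    void truncateFrom(long index);            // truncate: delete the entry at index and everything after it
    List<LogEntry> getEntries(long from, long to);  // range read
}
```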

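For this version we use an in-memory implementation and leave persistence for later:

```java
import java.util.HashMap;
import java.util.Map;

public class MemoryLogStorage implements LogStorage {
    private final Map<Long, LogEntry> logs = new HashMap<>();
    private long lastIndex = 0;
    private long lastTerm = 0;
    
    @Override
    public void append(LogEntry entry) {
        logs.put(entry.getIndex(), entry);
        lastIndex = entry.getIndex();
        lastTerm = entry.getTerm();
    }
    
    // Other methods omitted. Note that truncateFrom must also reset
    // lastIndex and lastTerm, otherwise they go stale after a truncation.
}
```

The log lives in a Map keyed by index. Crude, but good enough for now.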
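Since the remaining storage methods are omitted above, here is one possible way to fill them in. This is a sketch, not the project's implementation: `SortedMemoryLogStorage` is a hypothetical name, it swaps the `HashMap` for a `TreeMap` so that truncation and range reads become one-liners, and it assumes `getEntries` treats both bounds as inclusive. The small `LogEntry` stand-in (with a constructor and getters) is there only to keep the block self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Minimal stand-in for the article's LogEntry (constructor and getters assumed).
class LogEntry {
    private final long index;
    private final long term;
    private final byte[] command;

    LogEntry(long index, long term, byte[] command) {
        this.index = index;
        this.term = term;
        this.command = command;
    }

    long getIndex() { return index; }
    long getTerm() { return term; }
    byte[] getCommand() { return command; }
}

// Hypothetical completion of the in-memory storage, backed by a sorted map.
class SortedMemoryLogStorage {
    private final TreeMap<Long, LogEntry> logs = new TreeMap<>();

    void append(LogEntry entry) {
        // Assumption: entries are always appended at lastIndex + 1.
        if (entry.getIndex() != getLastIndex() + 1) {
            throw new IllegalArgumentException("gap or overwrite at index " + entry.getIndex());
        }
        logs.put(entry.getIndex(), entry);
    }

    LogEntry getEntry(long index) {
        return logs.get(index); // null when the index is absent
    }

    // Derived from the map, so it cannot go stale after a truncation.
    long getLastIndex() {
        return logs.isEmpty() ? 0 : logs.lastKey();
    }

    long getLastTerm() {
        return logs.isEmpty() ? 0 : logs.lastEntry().getValue().getTerm();
    }

    // Deletes the entry at index and everything after it.
    void truncateFrom(long index) {
        logs.tailMap(index, true).clear();
    }

    // Assumption: both bounds are inclusive; returns an empty list when from > to.
    List<LogEntry> getEntries(long from, long to) {
        if (from > to) {
            return new ArrayList<>();
        }
        return new ArrayList<>(logs.subMap(from, true, to, true).values());
    }
}
```

Deriving `getLastIndex` and `getLastTerm` from the map is a deliberate choice here: with cached fields, forgetting to update them inside `truncateFrom` is an easy mistake to make.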

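The consistency check: this is the hard part

With the log structure defined, the next question is: how does the Leader replicate its log to the Followers?

The simplest idea is for the Leader to send entries and for each Follower to append whatever it receives. But there is a problem: what if the Follower's log is inconsistent with the Leader's?

Consider this situation:

```text
Leader:     [1,term=1] [2,term=1] [3,term=2] [4,term=2] [5,term=2]
Follower-A: [1,term=1] [2,term=1] [3,term=2] [4,term=2] [5,term=2]  ← identical
Follower-B: [1,term=1] [2,term=1] [3,term=2]                        ← lagging
Follower-C: [1,term=1] [2,term=1] [3,term=1]                        ← conflict!
```

Follower-B is merely behind and only needs to catch up. Follower-C has a bigger problem: its entry at index=3 has term 1, while the Leader's has term 2. The two entries were written by different Leaders in different terms, so their contents may be entirely different.

In this case Follower-C must delete the entry at index=3 and everything after it, then replicate again from the Leader.

Raft detects this kind of inconsistency with a neat trick: prevLogIndex and prevLogTerm.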

```mermaid
sequenceDiagram
    participant L as Leader
    participant F as Follower
    Note over L: About to send the entry at index=4
    L->>F: AppendEntries<br/>prevLogIndex=3, prevLogTerm=2<br/>entries=[index=4]
    Note over F: Check the local entry at index=3
    alt term matches
        F->>F: Append index=4
        F-->>L: success=true
    else term differs or entry is missing
        F->>F: Reject and wait for the Leader to back off
        F-->>L: success=false
    end
```

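When the Leader sends entries, it attaches the index and term of the entry immediately before them. On receipt, the Follower first looks up its local entry at index=prevLogIndex and checks whether its term equals prevLogTerm.

If they match, every entry up to that position is consistent as well. This is an inductive guarantee: each earlier entry passed the same check when it was appended. The new entries can therefore be appended safely.

If they do not match, the logs conflict and the Follower rejects the request. On rejection, the Leader moves prevLogIndex back by one and tries again. It keeps backing off until it reaches a position where both sides agree, then replicates again from there.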
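The back-off loop can be simulated in a few lines. The sketch below is an illustration under simplifying assumptions, not the project's code: a log is reduced to a `long[]` of terms (position `i - 1` holds the term of index `i`), and `ConsistencyCheckDemo`, `matches`, and `findNextIndex` are hypothetical names. It runs the check against the three Followers from the example above.

```java
// Hypothetical simulation of the prevLogIndex/prevLogTerm handshake.
// A log is modeled as a long[] of terms: terms[i - 1] is the term at index i.
class ConsistencyCheckDemo {

    // Follower side: does the local log hold prevLogIndex with prevLogTerm?
    static boolean matches(long[] followerTerms, int prevLogIndex, long prevLogTerm) {
        if (prevLogIndex == 0) {
            return true; // nothing precedes the first entry
        }
        if (prevLogIndex > followerTerms.length) {
            return false; // the Follower is missing that entry
        }
        return followerTerms[prevLogIndex - 1] == prevLogTerm;
    }

    // Leader side: step nextIndex back until the check passes.
    // Returns the index from which the Leader has to resend entries.
    static int findNextIndex(long[] leaderTerms, long[] followerTerms) {
        int nextIndex = leaderTerms.length + 1; // optimistic starting point
        while (true) {
            int prevLogIndex = nextIndex - 1;
            long prevLogTerm = prevLogIndex == 0 ? 0 : leaderTerms[prevLogIndex - 1];
            if (matches(followerTerms, prevLogIndex, prevLogTerm)) {
                return nextIndex;
            }
            nextIndex--; // rejected: retry one position earlier
        }
    }

    public static void main(String[] args) {
        long[] leader = {1, 1, 2, 2, 2};
        long[] followerA = {1, 1, 2, 2, 2};
        long[] followerB = {1, 1, 2};
        long[] followerC = {1, 1, 1};
        System.out.println("Follower-A resumes at " + findNextIndex(leader, followerA)); // 6
        System.out.println("Follower-B resumes at " + findNextIndex(leader, followerB)); // 4
        System.out.println("Follower-C resumes at " + findNextIndex(leader, followerC)); // 3
    }
}
```

Follower-A needs nothing, Follower-B resumes at index 4, and Follower-C resumes at index 3, which is exactly the conflicting entry that has to be overwritten. The loop always terminates because prevLogIndex=0 matches unconditionally.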

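Extending the AppendEntries request

In the previous post AppendEntries only carried heartbeats. Now we extend it so that it can carry log entries:

```java
import java.util.List;

public class AppendEntriesRequest {
    private long term;           // the Leader's term (from the previous post)
    private String leaderId;     // the Leader's ID (from the previous post)
    
    // New fields
    private long prevLogIndex;   // index of the entry preceding the new ones
    private long prevLogTerm;    // term of that preceding entry
    private List<LogEntry> entries;  // entries to append; null or empty for a heartbeat
    private long leaderCommit;   // the Leader's commitIndex
}
```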

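The response is extended too, with conflict information:

```java
public class AppendEntriesResponse {
    private long term;           // the responder's current term
    private boolean success;     // whether the request succeeded
    
    // New: conflict information, used for fast rollback
    private Long conflictIndex;  // position of the conflict
    private Long conflictTerm;   // term found at the conflict position
}
```

conflictIndex and conflictTerm are an optimization. When a Follower detects a conflict, it can tell the Leader "this is the term I have at that position", which lets the Leader find the point of agreement faster instead of backing off one entry at a time.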

How the Follower handles entries

This is the most important piece of code in log replication, so let's go through it step by step:

```java
public AppendEntriesResponse handleAppendEntries(AppendEntriesRequest req) {
    lock.lock();
    try {
        // 1. Reject stale requests
        if (req.getTerm() < currentTerm) {
            return new AppendEntriesResponse(currentTerm, false);
        }
        
        // 2. The sender's term is higher than or equal to ours (always true
        //    after step 1): become a Follower of that Leader
        becomeFollower(req.getTerm());
        leaderId = req.getLeaderId();
        resetElectionTimeout();
        
        // 3. Consistency check (the core!)
        if (req.getPrevLogIndex() > 0) {
            LogEntry localPrev = logStorage.getEntry(req.getPrevLogIndex());
            
            // The entry does not exist locally
            if (localPrev == null) {
                return new AppendEntriesResponse(currentTerm, false);
            }
            
            // The entry exists locally but its term differs: conflict
            if (localPrev.getTerm() != req.getPrevLogTerm()) {
                // Delete the conflicting entry and everything after it
                logStorage.truncateFrom(req.getPrevLogIndex());
                return new AppendEntriesResponse(currentTerm, false, 
                    req.getPrevLogIndex(), localPrev.getTerm());
            }
        }
        
        // 4. Append entries
        if (req.getEntries() != null && !req.getEntries().isEmpty()) {
            for (LogEntry entry : req.getEntries()) {
                LogEntry existing = logStorage.getEntry(entry.getIndex());
                
                if (existing == null) {
                    // New entry: append directly
                    logStorage.append(entry);
                } else if (existing.getTerm() != entry.getTerm()) {
                    // Conflict: delete the old entries, append the new one
                    logStorage.truncateFrom(entry.getIndex());
                    logStorage.append(entry);
                }
                // Same term means a duplicate: skip it
            }
        }
        
        // 5. Advance commitIndex, but never past the last entry this request
        //    verified (stale local entries may follow it), and never backwards
        if (req.getLeaderCommit() > commitIndex) {
            long lastNewIndex = req.getPrevLogIndex()
                + (req.getEntries() == null ? 0 : req.getEntries().size());
            commitIndex = Math.max(commitIndex,
                Math.min(req.getLeaderCommit(), lastNewIndex));
        }
        
        return new AppendEntriesResponse(currentTerm, true);
        
    } finally {
        lock.unlock();
    }
}
```

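The whole flow is summarized in the table below:

| Check | Condition | Action |
| --- | --- | --- |
| Term check | term < currentTerm | Reject, return false |
| Term check | term >= currentTerm | Become Follower, continue |
| Consistency check | prevLogIndex = 0 | Skip the check, append directly |
| Consistency check | No local entry at prevLogIndex | Reject, the log is lagging |
| Consistency check | Term does not match | Delete the conflicting entries, reject |
| Consistency check | Term matches | Append the new entries |
| Commit | leaderCommit > commitIndex | Advance commitIndex |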

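A few things to watch out for here:

The prevLogIndex=0 case: this means the entries to append start at the very first log position, so there is no "previous entry" and the consistency check is skipped.

Truncate on conflict: when the term does not match, that entry and every entry after it must be deleted, because the later entries were appended on top of the conflicting one.

Skip duplicate entries: if an entry with the same index and term already exists locally, the request is a duplicate and there is nothing to do.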
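The append rules from step 4 can be exercised in isolation. The following sketch makes some simplifying assumptions: the local log is just a list of terms, the consistency check on the preceding entry has already passed, and `AppendMergeDemo` and `merge` are hypothetical names that do not appear in the project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of step 4: the local log is a list of terms, and
// position i in the list holds the entry with index i + 1.
class AppendMergeDemo {

    // Merges the Leader's entries, which start at firstIndex, into the local log.
    // Assumes the consistency check on firstIndex - 1 has already passed.
    static void merge(List<Long> localTerms, int firstIndex, List<Long> newTerms) {
        for (int i = 0; i < newTerms.size(); i++) {
            int index = firstIndex + i;
            long term = newTerms.get(i);
            if (index > localTerms.size()) {
                localTerms.add(term); // new entry: append
            } else if (localTerms.get(index - 1) != term) {
                // conflict: drop this entry and everything after it, then append
                localTerms.subList(index - 1, localTerms.size()).clear();
                localTerms.add(term);
            }
            // same index and term: duplicate delivery, nothing to do
        }
    }

    public static void main(String[] args) {
        // Follower-C from the earlier example: the entry at index 3 conflicts.
        List<Long> log = new ArrayList<>(Arrays.asList(1L, 1L, 1L));
        merge(log, 3, Arrays.asList(2L, 2L, 2L));
        System.out.println("after repair:     " + log);

        // A retransmitted, shorter request must not shrink the log.
        merge(log, 3, Arrays.asList(2L));
        System.out.println("after duplicate:  " + log);
    }
}
```

The second call is the interesting one: a delayed duplicate that carries only index 3 leaves indexes 4 and 5 in place, because truncation is triggered by a term mismatch and not by the mere arrival of a request.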

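Leader-side logic

The Leader has to track the replication progress of every Follower:

```java
// Initialize when becoming Leader
public void becomeLeader() {
    state = LEADER;
    leaderId = nodeId;
    
    long lastLogIndex = logStorage.getLastIndex();
    for (String peer : peers) {
        // nextIndex: index of the next entry to send to this peer
        nextIndex.put(peer, lastLogIndex + 1);
        // matchIndex: highest index known to be replicated on this peer
        // (0L, not 0, so that the value boxes to Long)
        matchIndex.put(peer, 0L);
    }
}
```

nextIndex is initialized to the position right after the Leader's last entry. This is an optimistic assumption: the Follower's log is as up to date as the Leader's. If it is not, the consistency check will walk nextIndex back step by step.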

The logic for sending entries:

```java
public void sendAppendEntries(String peer) {
    long nextIdx = nextIndex.get(peer);
    long prevLogIndex = nextIdx - 1;
    long prevLogTerm = 0;
    
    if (prevLogIndex > 0) {
        LogEntry prevLog = logStorage.getEntry(prevLogIndex);
        if (prevLog != null) {
            prevLogTerm = prevLog.getTerm();
        }
    }
    
    // Collect the entries to send
    List<LogEntry> entries = logStorage.getEntries(nextIdx, logStorage.getLastIndex());
    
    AppendEntriesRequest request = new AppendEntriesRequest(
        currentTerm, nodeId,
        prevLogIndex, prevLogTerm,
        entries, commitIndex
    );
    
    AppendEntriesResponse response = rpcClient.appendEntries(peer, request);
    
    if (response == null) return;  // network failure, retry next time
    
    if (response.getTerm() > currentTerm) {
        becomeFollower(response.getTerm());  // saw a higher term, step down
        return;
    }
    
    if (response.isSuccess()) {
        // Success: record the progress
        if (!entries.isEmpty()) {
            long lastNewIndex = entries.get(entries.size() - 1).getIndex();
            nextIndex.put(peer, lastNewIndex + 1);
            matchIndex.put(peer, lastNewIndex);
        }
    } else {
        // Failure: back off and retry
        nextIndex.put(peer, Math.max(1, nextIdx - 1));
    }
}
```

The back-off strategy here is deliberately simple: each failure moves back one position. conflictIndex and conflictTerm could be used to roll back faster, but this version sticks with the simple approach.
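For reference, here is a sketch of what a faster rollback could look like. It follows a commonly used variant rather than this project's code, and it differs from the handler shown earlier in one respect: the Follower reports the first index it holds for the conflicting term, not prevLogIndex itself. Logs are again reduced to `long[]` arrays of terms, and `FastRollbackDemo`, `NO_TERM`, `conflictHint`, `nextIndexAfterReject`, and `rejectionsUntilMatch` are all hypothetical names.

```java
// Hypothetical sketch of term-based fast rollback; not the article's code.
// A log is a long[] of terms: terms[i - 1] is the term of the entry at index i.
class FastRollbackDemo {
    static final long NO_TERM = -1; // marks "the Follower's log is too short"

    // Follower side: builds {conflictTerm, conflictIndex} for a rejected request.
    static long[] conflictHint(long[] followerTerms, int prevLogIndex) {
        if (prevLogIndex > followerTerms.length) {
            return new long[] {NO_TERM, followerTerms.length + 1};
        }
        long conflictTerm = followerTerms[prevLogIndex - 1];
        int first = prevLogIndex;
        while (first > 1 && followerTerms[first - 2] == conflictTerm) {
            first--; // walk back to the first entry of the conflicting term
        }
        return new long[] {conflictTerm, first};
    }

    // Leader side: chooses the next nextIndex from the Follower's hint.
    static int nextIndexAfterReject(long[] leaderTerms, int prevLogIndex,
                                    long conflictTerm, int conflictIndex) {
        if (conflictTerm != NO_TERM) {
            for (int i = prevLogIndex - 1; i >= 1; i--) {
                if (leaderTerms[i - 1] == conflictTerm) {
                    return i + 1; // just past the Leader's last entry of that term
                }
            }
        }
        return conflictIndex; // the Leader has no such term, or the log was too short
    }

    // Counts rejected requests until the consistency check passes.
    static int rejectionsUntilMatch(long[] leaderTerms, long[] followerTerms) {
        int nextIndex = leaderTerms.length + 1;
        int rejections = 0;
        while (true) {
            int prev = nextIndex - 1;
            boolean ok = prev == 0
                || (prev <= followerTerms.length
                    && followerTerms[prev - 1] == leaderTerms[prev - 1]);
            if (ok) {
                return rejections;
            }
            long[] hint = conflictHint(followerTerms, prev);
            nextIndex = nextIndexAfterReject(leaderTerms, prev, hint[0], (int) hint[1]);
            rejections++;
        }
    }

    public static void main(String[] args) {
        long[] leader = {1, 1, 1, 4, 4, 5, 5, 6, 6, 6};
        long[] follower = {1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3};
        System.out.println("rejections: " + rejectionsUntilMatch(leader, follower));
    }
}
```

With the two logs in `main`, the Leader reaches the point of agreement (index 3) after two rejected requests, skipping one whole term per round trip. Stepping back one index at a time would take seven rejections for the same pair of logs.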

Running it

The code is done, so let's run a test to verify it.

Start a 3-node cluster:

```bash
./start-cluster.sh
```

Check the status to find the Leader:

```bash
$ curl http://localhost:8081/raft/status
{"currentTerm":9,"state":"LEADER","nodeId":"node-1","lastLogIndex":0}

$ curl http://localhost:8082/raft/status
{"currentTerm":9,"state":"FOLLOWER","nodeId":"node-2","lastLogIndex":0}

$ curl http://localhost:8083/raft/status
{"currentTerm":9,"state":"FOLLOWER","nodeId":"node-3","lastLogIndex":0}
```

node-1 is the Leader, term=9, and there are no log entries yet (lastLogIndex=0).

Send one log entry to the Leader:

```bash
$ curl -X POST http://localhost:8081/raft/append \
  -H "Content-Type: application/json" \
  -d '{
    "term": 9,
    "leaderId": "node-1",
    "prevLogIndex": 0,
    "prevLogTerm": 0,
    "entries": [{"index": 1, "term": 9, "command": "aGVsbG8tcmFmdA=="}],
    "leaderCommit": 0
  }'

{"term":9,"success":true,"conflictIndex":0,"conflictTerm":0}
```

It worked! Check the Leader's status again:

```bash
$ curl http://localhost:8081/raft/status
{"currentTerm":9,"state":"LEADER","nodeId":"node-1","lastLogIndex":1}
```

lastLogIndex is now 1, which shows the entry was appended.
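A side note on the `command` value in the request: it looks like Base64, which is how many JSON libraries represent a `byte[]` field. Assuming that is what the HTTP layer does here, the sketch below shows how such a payload can be produced and read back; `CommandEncodingDemo` is a hypothetical helper, not part of the project.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for building and inspecting the "command" field.
class CommandEncodingDemo {

    // Encodes a text command as Base64 for use in the JSON body.
    static String encode(String command) {
        return Base64.getEncoder().encodeToString(command.getBytes(StandardCharsets.UTF_8));
    }

    // Decodes a Base64 "command" value back into text.
    static String decode(String base64) {
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decode("aGVsbG8tcmFmdA==")); // hello-raft
        System.out.println(encode("SET name=alice"));
    }
}
```

The value used in the curl request decodes to `hello-raft`. Raft itself never interprets these bytes, so any encoding works as long as the state machine that eventually applies the entry agrees with the client.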

The timeline of the whole test:

```mermaid
sequenceDiagram
    participant Client as Test script
    participant N1 as node-1<br/>(Leader)
    participant N2 as node-2<br/>(Follower)
    participant N3 as node-3<br/>(Follower)
    Note over Client,N3: 1. Query status and identify the Leader
    Client->>N1: GET /raft/status
    N1-->>Client: state=LEADER, term=9
    Note over Client,N3: 2. Send a log entry to the Leader
    Client->>N1: POST /raft/append<br/>entries=[index=1, term=9]
    Note over N1: Consistency check passes<br/>prevLogIndex=0
    Note over N1: Append the entry<br/>lastLogIndex: 0→1
    N1-->>Client: success=true
    Note over Client,N3: 3. Leader replicates to the Followers via heartbeats
    loop every 50ms
        N1->>N2: AppendEntries (heartbeat)
        N2-->>N1: success=true
        N1->>N3: AppendEntries (heartbeat)
        N3-->>N1: success=true
    end
    Note over Client,N3: 4. Verify the state of each node
    Client->>N1: GET /raft/status
    N1-->>Client: lastLogIndex=1
```

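What this version still lacks

To be honest, this is only half of log replication. We have implemented:

- Log structure and storage
- The consistency check mechanism
- Follower-side receiving and appending of entries
- Leader-side tracking of replication progress

But a few key pieces are still missing:

Majority commit: after appending an entry, the Leader must wait for a majority of nodes to acknowledge it before committing. Right now entries are appended but never really "committed".

Applying to the state machine: once committed, an entry has to be applied to the state machine. Right now entries are only stored, never executed.

Automatic replication: right now entries are sent by hand with curl. In practice the Leader should replicate to the Followers automatically whenever it receives a client request.

These are left for the next post.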

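A few pitfalls

I ran into several pitfalls while writing this code, so here they are for the record:

1. Do not look up the log when prevLogIndex=0

At first I called logStorage.getEntry(prevLogIndex) directly. With prevLogIndex=0 it returned null, and the request was rejected. In fact prevLogIndex=0 means there is no preceding entry, so the check should simply be skipped.

2. Truncate from the conflict position onward

When the term does not match, the entry at the conflict position and everything after it must be deleted. At first I deleted only the conflicting entry and left the later ones in place, and things got messier and messier.

3. Do not append duplicate entries again

If a received entry has the same index and term as the local one, the request is a duplicate (a network retransmission, for example) and should be skipped. At first I appended every time, and the log just kept growing.

4. nextIndex initialization matters

A newly elected Leader does not know how far each Follower's log has progressed, so it can only assume, optimistically, that it matches its own, and then let the consistency check correct that step by step. It is a rather clever design.

If you are also learning Raft, or have questions about log replication, feel free to discuss in the comments. You can also just leave a question there and I will reply when I see it.

In the next post we implement majority commit and the state machine. Only then can we say we really understand the core of Raft.

The code is open source, stars are welcome: gitee.com/sh_wangwanb...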
