flink的TwoPhaseCommitSinkFunction怎么做才能提供精准一次保证

背景

TwoPhaseCommitSinkFunction是flink中基于二阶段事务提交和检查点机制配合使用实现的精准一次的输出数据汇，但是想要实现精准一次的输出，实际使用中需要注意几个方面，否则不仅仅达不到精准一次输出，反而可能导致数据丢失，连至少一次的语义都不能达到

TwoPhaseCommitSinkFunction注意事项

TwoPhaseCommitSinkFunction是通过在两阶段提交协议实现的事务，大概简化为一下步骤：

1 在收到检查点分隔符的时候，开启事务，并把记录都写到开启的事务中，

开始进行状态的保存时，把检查点id对应的事务结束掉，做好准备提交的准备，并开启下一个事务

java 复制代码

public void snapshotState(FunctionSnapshotContext context) throws Exception {
        // this is like the pre-commit of a 2-phase-commit transaction
        // we are ready to commit and remember the transaction

        checkState(
                currentTransactionHolder != null,
                "bug: no transaction object when performing state snapshot");

        long checkpointId = context.getCheckpointId();
        LOG.debug(
                "{} - checkpoint {} triggered, flushing transaction '{}'",
                name(),
                context.getCheckpointId(),
                currentTransactionHolder);
		//当前检查点对应的事务做好准备，比如进行stream.flush等，准备好提交事务
        preCommit(currentTransactionHolder.handle);
        // 把当前检查点id对应的事务添加到状态中
        pendingCommitTransactions.put(checkpointId, currentTransactionHolder);
        LOG.debug("{} - stored pending transactions {}", name(), pendingCommitTransactions);

        currentTransactionHolder = beginTransactionInternal();
        LOG.debug("{} - started new transaction '{}'", name(), currentTransactionHolder);
        // 把当前检查点id对应的事务添加到状态中
        state.clear();
        state.add(
                new State<>(
                        this.currentTransactionHolder,
                        new ArrayList<>(pendingCommitTransactions.values()),
                        userContext));
    }

收到检查点完成的通知notify方法，提交第二步中检查点id对应的事务，注意这一步不是每次flink在进行检查点的时候都会通知，这种情况下，某一次的notify方法就需要把前几次的事务一起进行提交了，另外，如果提交某个检查点的事务失败，那么应用会重启，并且在重启后的initSnapshot方法中再次进行事务提交，如果还是失败，这个过程一直持续

java 复制代码

    public final void notifyCheckpointComplete(long checkpointId) throws Exception {
        Iterator<Map.Entry<Long, TransactionHolder<TXN>>> pendingTransactionIterator =
                pendingCommitTransactions.entrySet().iterator();
        Throwable firstError = null;

        while (pendingTransactionIterator.hasNext()) {
            Map.Entry<Long, TransactionHolder<TXN>> entry = pendingTransactionIterator.next();
            Long pendingTransactionCheckpointId = entry.getKey();
            TransactionHolder<TXN> pendingTransaction = entry.getValue();
            if (pendingTransactionCheckpointId > checkpointId) {
                continue;
            }

            LOG.info(
                    "{} - checkpoint {} complete, committing transaction {} from checkpoint {}",
                    name(),
                    checkpointId,
                    pendingTransaction,
                    pendingTransactionCheckpointId);

            logWarningIfTimeoutAlmostReached(pendingTransaction);
            try {
            // 提交事务
                commit(pendingTransaction.handle);
            } catch (Throwable t) {
            //事务失败时记录异常，后面会把异常抛出导致应用重启
                if (firstError == null) {
                    firstError = t;
                }
            }

            LOG.debug("{} - committed checkpoint transaction {}", name(), pendingTransaction);
          // 事务成功后移除当前的事务
            pendingTransactionIterator.remove();
        }

        if (firstError != null) {
        // 事务提交失败会抛出异常，导致job异常中止
            throw new FlinkRuntimeException(
                    "Committing one of transactions failed, logging first encountered failure",
                    firstError);
        }
    }

总结：

1。事务不能提交失败，如果失败会导致作业失败然后重新提交，如果最终没有成功提交，那么数据会丢失

2。数据库服务端的事务超时时间不能设置太短，不能仅仅大于检查点的间隔大小，原因是上面说的，flink有可能丢失检查点完成后的通知消息，所以服务端的事务超时时间要设置的足够大.