Flink Keyed State 详解之二

1. 基本概念

Operator State(算子状态)是 Flink 中另一种重要的状态类型。与 Keyed State 不同,Operator State 与特定的算子实例相关联,而不是与特定的 key 相关联。Operator State 的特点是:

  • 作用域限定在算子实例上
  • 每个并行算子实例都有其独立的状态副本
  • 状态会根据算子的并行度自动重新分布

2. 适用场景

Operator State 适用于以下场景:

  1. Kafka 消费者偏移量管理
  2. 需要维护算子级别状态的场景
  3. 广播状态模式
  4. 需要在算子实例间共享状态的场景

3. Operator State 类型

Flink 提供了多种 Operator State 类型:

3.1 ListState

ListState 是 Operator State 中最常用的状态类型,用于存储元素列表。

java 复制代码
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.source.SourceFunction;

/**
 * Operator ListState 使用示例
 * 模拟一个带有状态的 SourceFunction
 */
public class StatefulSourceFunction implements SourceFunction<String>, CheckpointedFunction {
    private volatile boolean isRunning = true;
    private ListState<Long> checkpointedCount;
    private long count = 0L;

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        while (isRunning && count < 1000) {
            synchronized (ctx.getCheckpointLock()) {
                ctx.collect("Element: " + count);
                count++;
                Thread.sleep(100); // 模拟处理时间
            }
        }
    }

    @Override
    public void cancel() {
        isRunning = false;
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        checkpointedCount.clear();
        checkpointedCount.add(count);
    }

    @Override
    public void initializeState(FunctionInitializationContext context) throws Exception {
        ListStateDescriptor<Long> descriptor = new ListStateDescriptor<>(
            "count-state",
            Long.class
        );
        
        checkpointedCount = context.getOperatorStateStore().getListState(descriptor);
        
        if (context.isRestored()) {
            for (Long countValue : checkpointedCount.get()) {
                count = countValue;
            }
        }
    }
}

3.2 UnionListState

UnionListState 与 ListState 类似,但在恢复时会将所有并行实例的状态合并。

java 复制代码
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
import java.util.ArrayList;
import java.util.List;

/**
 * UnionListState 使用示例
 * 维护所有并行实例处理的元素列表
 */
public class UnionListStateSink extends RichSinkFunction<String> implements CheckpointedFunction {
    private List<String> processedElements;
    private ListState<String> checkpointedState;

    @Override
    public void open(Configuration parameters) throws Exception {
        processedElements = new ArrayList<>();
    }

    @Override
    public void invoke(String value, Context context) throws Exception {
        processedElements.add(value);
        System.out.println("Processed: " + value + " by subtask " + getRuntimeContext().getIndexOfThisSubtask());
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        checkpointedState.clear();
        for (String element : processedElements) {
            checkpointedState.add(element);
        }
    }

    @Override
    public void initializeState(FunctionInitializationContext context) throws Exception {
        ListStateDescriptor<String> descriptor = new ListStateDescriptor<>(
            "union-list-state",
            String.class
        );

        // 获取 UnionListState
        checkpointedState = context.getOperatorStateStore().getUnionListState(descriptor);

        if (context.isRestored()) {
            processedElements.clear();
            for (String element : checkpointedState.get()) {
                processedElements.add(element);
            }
        }
    }
}

3.3 BroadcastState<K, V>

BroadcastState 用于广播状态模式,允许将状态从一个流广播到所有并行实例。

java 复制代码
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.state.BroadcastState;
import org.apache.flink.api.common.state.ReadOnlyBroadcastState;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

/**
 * BroadcastState 使用示例
 * 实现规则广播和数据处理
 */
public class BroadcastStateExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        
        // 创建规则流(需要广播)
        DataStream<Rule> rulesStream = env.fromElements(
            new Rule("rule1", "pattern1"),
            new Rule("rule2", "pattern2")
        );
        
        // 创建数据流
        DataStream<Event> dataStream = env.fromElements(
            new Event("event1", "pattern1"),
            new Event("event2", "pattern2"),
            new Event("event3", "pattern1")
        );
        
        // 创建广播状态描述符
        MapStateDescriptor<String, Rule> ruleStateDescriptor = new MapStateDescriptor<>(
            "RulesBroadcastState",
            String.class,
            Rule.class
        );
        
        // 将规则流转换为广播流
        BroadcastStream<Rule> broadcastRules = rulesStream.broadcast(ruleStateDescriptor);
        
        // 连接数据流和广播流进行处理
        DataStream<String> output = dataStream
            .connect(broadcastRules)
            .process(new RuleBroadcastProcessFunction());
            
        output.print();
        
        env.execute("Broadcast State Example");
    }
    
    /**
     * 广播状态处理函数
     */
    public static class RuleBroadcastProcessFunction extends BroadcastProcessFunction<Event, Rule, String> {
        private final MapStateDescriptor<String, Rule> ruleStateDescriptor;
        
        public RuleBroadcastProcessFunction() {
            this.ruleStateDescriptor = new MapStateDescriptor<>(
                "RulesBroadcastState",
                String.class,
                Rule.class
            );
        }
        
        // 处理广播的规则元素
        @Override
        public void processBroadcastElement(Rule rule, Context ctx, Collector<String> out) throws Exception {
            BroadcastState<String, Rule> broadcastState = ctx.getBroadcastState(ruleStateDescriptor);
            // 将规则存储到广播状态中
            broadcastState.put(rule.getName(), rule);
        }
        
        // 处理数据流元素
        @Override
        public void processElement(Event event, ReadOnlyContext ctx, Collector<String> out) throws Exception {
            // 从广播状态中获取规则
            ReadOnlyBroadcastState<String, Rule> broadcastState = ctx.getBroadcastState(ruleStateDescriptor);
            
            // 遍历所有规则并匹配事件
            for (Map.Entry<String, Rule> entry : broadcastState.immutableEntries()) {
                Rule rule = entry.getValue();
                if (rule.getPattern().equals(event.getPattern())) {
                    out.collect("Event " + event.getName() + " matches rule " + rule.getName());
                }
            }
        }
    }
    
    // 规则类
    public static class Rule {
        private String name;
        private String pattern;
        
        public Rule() {}
        
        public Rule(String name, String pattern) {
            this.name = name;
            this.pattern = pattern;
        }
        
        public String getName() { return name; }
        public String getPattern() { return pattern; }
    }
    
    // 事件类
    public static class Event {
        private String name;
        private String pattern;
        
        public Event() {}
        
        public Event(String name, String pattern) {
            this.name = name;
            this.pattern = pattern;
        }
        
        public String getName() { return name; }
        public String getPattern() { return pattern; }
    }
}

4. 配置方法

4.1 状态描述符

Operator State 使用相应的描述符:

java 复制代码
// ListStateDescriptor
ListStateDescriptor<String> listDescriptor = new ListStateDescriptor<>(
    "operator-list-state",
    String.class
);

// BroadcastStateDescriptor(通过 MapStateDescriptor 创建)
MapStateDescriptor<String, String> broadcastDescriptor = new MapStateDescriptor<>(
    "broadcast-state",
    String.class,
    String.class
);

4.2 状态后端配置

Operator State 的存储依赖于状态后端:

java 复制代码
import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
import org.apache.flink.runtime.state.hashmap.HashMapStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// 配置 HashMap 状态后端(默认)
StreamExecutionEnvironment env1 = StreamExecutionEnvironment.getExecutionEnvironment();
env1.setStateBackend(new HashMapStateBackend());

// 配置 RocksDB 状态后端(适用于大状态)
StreamExecutionEnvironment env2 = StreamExecutionEnvironment.getExecutionEnvironment();
env2.setStateBackend(new EmbeddedRocksDBStateBackend());

5. CheckpointedFunction 接口

实现 Operator State 的主要方式是实现 CheckpointedFunction 接口:

java 复制代码
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;

public class MyCheckpointedFunction implements CheckpointedFunction {
    private ListState<Integer> checkpointedState;
    private int count = 0;

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        // 快照状态时调用
        checkpointedState.clear();
        checkpointedState.add(count);
    }

    @Override
    public void initializeState(FunctionInitializationContext context) throws Exception {
        // 初始化状态时调用
        ListStateDescriptor<Integer> descriptor = new ListStateDescriptor<>(
            "checkpointed-count",
            Integer.class
        );
        
        checkpointedState = context.getOperatorStateStore().getListState(descriptor);
        
        if (context.isRestored()) {
            // 从检查点恢复状态
            for (Integer countValue : checkpointedState.get()) {
                count = countValue;
            }
        }
    }
}

6. 完整使用示例

java 复制代码
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/**
 * Operator State 完整使用示例
 * 实现一个带有状态的并行数据源
 */
public class OperatorStateExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(5000); // 每5秒进行一次检查点
        
        DataStream<Integer> numbers = env.addSource(new ParallelNumberSource(100));
        numbers.print();
        
        env.execute("Operator State Example");
    }
    
    /**
     * 并行数字源函数
     * 每个并行实例维护自己的状态
     */
    public static class ParallelNumberSource extends RichParallelSourceFunction<Integer> implements CheckpointedFunction {
        private final int maxCount;
        private volatile boolean isRunning = true;
        private int currentCount = 0;
        private ListState<Integer> checkpointedCount;
        private Random random = new Random();
        
        public ParallelNumberSource(int maxCount) {
            this.maxCount = maxCount;
        }
        
        @Override
        public void run(SourceContext<Integer> ctx) throws Exception {
            while (isRunning && currentCount < maxCount) {
                synchronized (ctx.getCheckpointLock()) {
                    ctx.collect(currentCount);
                    currentCount++;
                    Thread.sleep(100 + random.nextInt(200)); // 随机延迟
                }
            }
        }
        
        @Override
        public void cancel() {
            isRunning = false;
        }
        
        @Override
        public void snapshotState(FunctionSnapshotContext context) throws Exception {
            checkpointedCount.clear();
            checkpointedCount.add(currentCount);
        }
        
        @Override
        public void initializeState(FunctionInitializationContext context) throws Exception {
            ListStateDescriptor<Integer> descriptor = new ListStateDescriptor<>(
                "number-source-count",
                Integer.class
            );
            
            checkpointedCount = context.getOperatorStateStore().getListState(descriptor);
            
            if (context.isRestored()) {
                // 从检查点恢复状态
                for (Integer count : checkpointedCount.get()) {
                    currentCount = count;
                }
                System.out.println("Subtask " + getRuntimeContext().getIndexOfThisSubtask() + 
                    " restored count to " + currentCount);
            } else {
                System.out.println("Subtask " + getRuntimeContext().getIndexOfThisSubtask() + 
                    " starting with count " + currentCount);
            }
        }
    }
}

7. 最佳实践建议

  1. 合理选择状态类型

    • 一般状态管理使用 ListState
    • 需要合并状态时使用 UnionListState
    • 广播场景使用 BroadcastState
  2. 状态命名规范

    • 使用有意义的名称
    • 避免重复名称
    • 建议使用小写字母和连字符
  3. 状态管理

    • 在 snapshotState 中正确保存状态
    • 在 initializeState 中正确恢复状态
    • 处理并行度变化时的状态重分布
  4. 性能优化

    • 避免在状态中存储大量数据
    • 合理设置检查点间隔
    • 考虑使用 RocksDB 状态后端处理大状态
  5. 容错处理

    • 确保状态操作的幂等性
    • 处理状态恢复时的异常情况
    • 定期检查点以保证状态一致性
  6. 广播状态注意事项

    • 广播状态只允许在 processBroadcastElement 中修改
    • 在 processElement 中只能读取广播状态
    • 注意广播状态的内存使用

通过合理使用 Operator State,可以有效地在 Flink 应用程序中维护和处理与特定算子实例相关联的状态信息,实现复杂的状态管理和计算逻辑。

相关推荐
老蒋新思维1 小时前
创客匠人峰会新视角:AI 时代知识变现的 “组织化转型”—— 从个人 IP 到 “AI+IP” 组织的增长革命
大数据·人工智能·网络协议·tcp/ip·创始人ip·创客匠人·知识变现
TMO Group 探谋网络科技1 小时前
AI Agent工作原理:如何连接数据、决策与行动,助力企业数字化转型?
大数据·人工智能·ai
Chasing Aurora2 小时前
Git 工程指引(命令+问题)
大数据·git·elasticsearch·团队开发·互联网大厂
TG:@yunlaoda360 云老大3 小时前
阿里云国际站代理商RPA跨境服务的适用场景有哪些?
大数据·阿里云·rpa
微盛企微增长小知识3 小时前
2025企业微信服务商测评:头部服务商微盛AI·企微管家技术实力与落地效果解析
大数据·人工智能·企业微信
TMO Group 探谋网络科技3 小时前
AI电商的应用:Magento 使用 Adobe 生成式 AI改造7大业务场景
大数据·人工智能·adobe·ai
UI设计兰亭妙微3 小时前
理性数据,温柔体验:北京兰亭妙微解码 Hydra Corps. 企业管理界面的 “松弛感设计”
大数据·人工智能·用户体验设计
慎独4133 小时前
家家有:从单向支出到价值循环,绿色积分如何 重构商业逻辑?
大数据·人工智能
梦里不知身是客114 小时前
flink使用 DefaultResourceCalculator(默认资源计算器) 策略
大数据·flink
五度易链-区域产业数字化管理平台4 小时前
十五五规划明确数智方向数字经济将迎新变化,五度易链大数据、AI技术如何从单点应用走向全域赋能
大数据·人工智能