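Flink state TTL parameter settings

Prerequisites

The example below consumes Kafka from Flink and inspects the data held in a ListState to determine exactly what each parameter means.

Kafka producer code: it sends records for two key values, once per second.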

	// Each iteration sends one record for id 1 and one for id 2, then waits one second.
	for (int i = 0; i < 100; i++) {
	    // id 1 carries the values 0..99
	    JSONObject object = new JSONObject();
	    object.put("id", 1);
	    object.put("value", i);
	    String s = object.toJSONString();
	    // get() blocks until the broker has acknowledged the record
	    kafkaProducer.send(new ProducerRecord<>("test_topic_partition_one", s.getBytes(StandardCharsets.UTF_8))).get();
	
	    // id 2 carries the values 100..199
	    object = new JSONObject();
	    object.put("id", 2);
	    object.put("value", 100 + i);
	    s = object.toJSONString();
	    kafkaProducer.send(new ProducerRecord<>("test_topic_partition_one", s.getBytes(StandardCharsets.UTF_8))).get();
	    Thread.sleep(1000);
	}

Flink job that consumes from Kafka:

	final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
	env.enableCheckpointing(10 * 1000);
	KafkaSource<String> source = KafkaSource.<String>builder()
	                .setBootstrapServers("broker:9092")
	                .setProperties(properties)
	                .setTopics("test_topic_partition_one")
	                .setGroupId("my-group")
	                .setStartingOffsets(OffsetsInitializer.latest())
	                .setValueOnlyDeserializer(new SimpleStringSchema())
	                .build();
	DataStreamSource<String> kafkaSource = env
		.fromSource(source, WatermarkStrategy.noWatermarks(), "Kafka Source")
		.setParallelism(2);
	
	DataStream<Tuple2<String, Integer>> dataStream = kafkaSource.map(
	                new MapFunction<String, Tuple2<String, Integer>>() {
	                    @Override
	                    public Tuple2<String, Integer> map(String value) throws Exception {
	                        JSONObject object = JSONObject.parseObject(value);
	                        return new Tuple2<String, Integer>(object.getString("id"), object.getInteger("value"));
	                    }
	                });
	
	DataStream<String> resultStream = dataStream
	                .keyBy(value -> value.f0) // group by the first field (the key)
	                .process(new ListValueProcess());
	
	 // Print the results
	 resultStream.print();
	
	 // Submit the job; nothing runs until execute() is called
	 env.execute();

The ListValueProcess state function:

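	import java.util.ArrayList;
	import java.util.List;

	import org.apache.flink.api.common.state.ListState;
	import org.apache.flink.api.common.state.ListStateDescriptor;
	import org.apache.flink.api.common.state.StateTtlConfig;
	import org.apache.flink.api.common.time.Time;
	import org.apache.flink.api.java.tuple.Tuple2;
	import org.apache.flink.configuration.Configuration;
	import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
	import org.apache.flink.util.Collector;

	public class ListValueProcess extends KeyedProcessFunction<String, Tuple2<String, Integer>, String> {

	    // Keyed state handle, initialized in open()
	    private transient ListState<Integer> listState;

	    @Override
	    public void processElement(Tuple2<String, Integer> value, KeyedProcessFunction<String, Tuple2<String, Integer>, String>.Context ctx, Collector<String> out) throws Exception {
	        // Append the new element to the ListState (a write)
	        listState.add(value.f1);

	        // Read every element currently in the ListState (a read) and emit them
	        String key = value.f0;
	        List<Integer> list = new ArrayList<>();
	        for (Integer integer : listState.get()) {
	            list.add(integer);
	        }
	        String result = "key:" + key + ", value:" + list;
	        // Emit the result
	        out.collect(result);
	    }

	    @Override
	    public void open(Configuration parameters) throws Exception {
	        super.open(parameters);

	        // TTL of 10 seconds; the update type and visibility are the two settings under test
	        StateTtlConfig ttlConfig = StateTtlConfig
	                .newBuilder(Time.seconds(10))
	                .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
	                .setStateVisibility(StateTtlConfig.StateVisibility.ReturnExpiredIfNotCleanedUp)
	                .build();

	        // Initialize the ListState.
	        // Each key has its own ListState,
	        // which stores multiple values for that key.
	        ListStateDescriptor<Integer> integerListStateDescriptor = new ListStateDescriptor<>("my-list-state", Integer.class);
	        integerListStateDescriptor.enableTimeToLive(ttlConfig);

	        listState = getRuntimeContext().getListState(integerListStateDescriptor);
	    }
	}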

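As the code shows, StateTtlConfig has three main parameters:

  • the time-to-live, i.e. how long state is retained
  • setUpdateType, which sets when the TTL timestamp is refreshed: OnCreateAndWrite or OnReadAndWrite
  • setStateVisibility, which sets whether expired state can still be read: ReturnExpiredIfNotCleanedUp or NeverReturnExpired

Here the state is retained for 10 s.
OnCreateAndWrite: the TTL timestamp is refreshed when the state is created or written.
OnReadAndWrite: the TTL timestamp is refreshed when the state is created, written, or read.
ReturnExpiredIfNotCleanedUp: state that has expired but has not yet been cleaned up can still be read.
NeverReturnExpired: state can no longer be read once it has expired.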
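To make the four combinations easier to reason about, here is a minimal sketch of the semantics in plain Java. It is a toy model, not Flink's implementation: the class `TtlListModel`, its methods, and the explicit `now` clock are all hypothetical names introduced for illustration. It assumes each list element carries its own timestamp, that a read under OnReadAndWrite refreshes only elements that have not yet expired, and that cleanup is a separate step.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of per-element TTL on a list. Hypothetical; not Flink's implementation. */
class TtlListModel {
    enum UpdateType { ON_CREATE_AND_WRITE, ON_READ_AND_WRITE }
    enum Visibility { RETURN_EXPIRED_IF_NOT_CLEANED_UP, NEVER_RETURN_EXPIRED }

    private final long ttlMillis;
    private final UpdateType updateType;
    private final Visibility visibility;
    // Each entry is {value, lastAccessTimestampMillis}.
    private final List<long[]> entries = new ArrayList<>();

    TtlListModel(long ttlMillis, UpdateType updateType, Visibility visibility) {
        this.ttlMillis = ttlMillis;
        this.updateType = updateType;
        this.visibility = visibility;
    }

    private boolean expired(long[] entry, long now) {
        return entry[1] + ttlMillis <= now;
    }

    /** A write always stamps the new element with the current time. */
    void add(int value, long now) {
        entries.add(new long[] {value, now});
    }

    /** Returns the visible values; under ON_READ_AND_WRITE it also refreshes live entries. */
    List<Integer> get(long now) {
        List<Integer> visible = new ArrayList<>();
        for (long[] entry : entries) {
            boolean isExpired = expired(entry, now);
            if (isExpired && visibility == Visibility.NEVER_RETURN_EXPIRED) {
                continue; // hidden even though it is still physically stored
            }
            visible.add((int) entry[0]);
            if (!isExpired && updateType == UpdateType.ON_READ_AND_WRITE) {
                entry[1] = now; // a read counts as an access
            }
        }
        return visible;
    }

    /** Stands in for background cleanup: physically drops expired entries. */
    void cleanUp(long now) {
        entries.removeIf(entry -> expired(entry, now));
    }

    /** Adds one value per simulated second and reads the list after every add. */
    static List<Integer> simulate(UpdateType updateType, Visibility visibility, int seconds) {
        TtlListModel state = new TtlListModel(10_000L, updateType, visibility);
        List<Integer> lastRead = new ArrayList<>();
        for (int i = 0; i < seconds; i++) {
            long now = i * 1000L;
            state.add(i, now);
            lastRead = state.get(now);
        }
        return lastRead;
    }

    public static void main(String[] args) {
        for (UpdateType u : UpdateType.values()) {
            for (Visibility v : Visibility.values()) {
                System.out.println(u + " + " + v + " -> " + simulate(u, v, 13));
            }
        }
    }
}
```

In this model, after 13 simulated seconds the OnCreateAndWrite + NeverReturnExpired run returns only the ten most recent values, while both OnReadAndWrite runs return all thirteen, because every read refreshes the timestamps before anything can expire. The real job's timing is less exact than the model's clock, so the window sizes in actual output can differ slightly.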

Sample results:

With OnCreateAndWrite + ReturnExpiredIfNotCleanedUp:

1> key:1, value:[6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]
1> key:2, value:[113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124]
1> key:1, value:[6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]
1> key:2, value:[113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125]
1> key:1, value:[6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26]
1> key:2, value:[113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126]
1> key:1, value:[18, 19, 20, 21, 22, 23, 24, 25, 26, 27]
1> key:2, value:[113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127]
1> key:1, value:[18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28]
1> key:2, value:[113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128]
1> key:1, value:[18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
1> key:2, value:[113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129]
1> key:1, value:[18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
1> key:2, value:[113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130]

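As the output shows, expired entries are deleted periodically rather than at the moment they expire, so the visible data can span more than 10 s.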
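When expired entries are physically removed depends on the cleanup strategy, which the StateTtlConfig builder also lets you set. The sketch below lists the three builder options, assuming the same Flink 1.x API as the code above (`Time`-based `newBuilder`). The class name `TtlCleanupExamples` and the numeric arguments are illustrative choices, and which strategy was in effect for the run above is not stated in this article.

```java
import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.time.Time;

// Hypothetical holder class; each method shows one cleanup option on the builder.
class TtlCleanupExamples {

    // Leave expired entries out of full snapshots taken at checkpoint/savepoint time.
    // This alone does not shrink the local state of a running job.
    static StateTtlConfig fullSnapshotCleanup() {
        return StateTtlConfig.newBuilder(Time.seconds(10))
                .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                .setStateVisibility(StateTtlConfig.StateVisibility.ReturnExpiredIfNotCleanedUp)
                .cleanupFullSnapshot()
                .build();
    }

    // Heap state backends: examine up to 10 entries per state access,
    // and additionally on every processed record (second argument).
    static StateTtlConfig incrementalCleanup() {
        return StateTtlConfig.newBuilder(Time.seconds(10))
                .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                .setStateVisibility(StateTtlConfig.StateVisibility.ReturnExpiredIfNotCleanedUp)
                .cleanupIncrementally(10, true)
                .build();
    }

    // RocksDB state backend: drop expired entries during compaction,
    // re-reading the current timestamp after every 1000 processed entries.
    static StateTtlConfig rocksDbCompactionCleanup() {
        return StateTtlConfig.newBuilder(Time.seconds(10))
                .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                .setStateVisibility(StateTtlConfig.StateVisibility.ReturnExpiredIfNotCleanedUp)
                .cleanupInRocksdbCompactFilter(1000)
                .build();
    }
}
```

Any of these configs can be passed to `enableTimeToLive` on the state descriptor in place of the one built in `open()`.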

With OnCreateAndWrite + NeverReturnExpired:

1> key:2, value:[109, 110, 111, 112, 113, 114, 115, 116, 117]
1> key:1, value:[10, 11, 12, 13, 14, 15, 16, 17, 18]
1> key:2, value:[110, 111, 112, 113, 114, 115, 116, 117, 118]
1> key:1, value:[11, 12, 13, 14, 15, 16, 17, 18, 19]
1> key:2, value:[111, 112, 113, 114, 115, 116, 117, 118, 119]
1> key:1, value:[12, 13, 14, 15, 16, 17, 18, 19, 20]
1> key:2, value:[112, 113, 114, 115, 116, 117, 118, 119, 120]
1> key:1, value:[13, 14, 15, 16, 17, 18, 19, 20, 21]
1> key:2, value:[113, 114, 115, 116, 117, 118, 119, 120, 121]
1> key:1, value:[14, 15, 16, 17, 18, 19, 20, 21, 22]
1> key:2, value:[114, 115, 116, 117, 118, 119, 120, 121, 122]

As the output shows, only values from the last 10 s are returned.

With OnReadAndWrite + ReturnExpiredIfNotCleanedUp:

1> key:1, value:[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]
1> key:2, value:[101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125]
1> key:1, value:[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26]
1> key:2, value:[101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126]
1> key:1, value:[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27]
1> key:2, value:[101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127]
1> key:1, value:[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28]
1> key:2, value:[101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128]
1> key:1, value:[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
1> key:2, value:[101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129]
1> key:1, value:[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
1> key:2, value:[101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130]
1> key:1, value:[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
1> key:2, value:[101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131]
1> key:1, value:[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32]
1> key:2, value:[101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132]

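As the output shows, the state keeps every value: each incoming record triggers a read of the list, and every read refreshes the TTL, so nothing expires.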

With OnReadAndWrite + NeverReturnExpired:

1> key:1, value:[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26]
1> key:2, value:[101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126]
1> key:1, value:[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27]
1> key:2, value:[101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127]
1> key:1, value:[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28]
1> key:2, value:[101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128]
1> key:1, value:[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
1> key:2, value:[101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129]
1> key:1, value:[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
1> key:2, value:[101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130]
1> key:1, value:[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
1> key:2, value:[101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131]

As the output shows, the state again keeps every value: every read refreshes the TTL, so nothing expires.
