37. Flink WindowAssigner: Session Window Examples

1. Processing time

No watermark strategy or timestamp assignment is required; only the session gap is specified.

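// Key by the raw line; a session for a key closes after 10 minutes without new elements (processing time)
input.keyBy(e -> e)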
                .window(ProcessingTimeSessionWindows.withGap(Time.minutes(10)))
                .apply(new WindowFunction<String, String, String, TimeWindow>() {
                    @Override
                    public void apply(String s, TimeWindow timeWindow, Iterable<String> iterable, Collector<String> collector) throws Exception {
                        for (String string : iterable) {
                            collector.collect(string);
                        }
                    }
                })
                .print();
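
To make the effect of a fixed gap concrete, the following is a minimal sketch in plain Java with no Flink dependency. It assumes the arrival times of one key are already sorted, and it starts a new session whenever the distance to the previous element exceeds the gap. The class name `FixedGapSessions` and the sample timestamps are hypothetical, and the sketch only mimics the grouping result, not how Flink implements it.

```java
import java.util.ArrayList;
import java.util.List;

class FixedGapSessions {
    /**
     * Splits ascending timestamps (ms) of a single key into sessions.
     * A new session starts when the distance to the previous element exceeds gapMs.
     */
    static List<List<Long>> split(List<Long> sortedTimestamps, long gapMs) {
        List<List<Long>> sessions = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        for (long ts : sortedTimestamps) {
            if (!current.isEmpty() && ts - current.get(current.size() - 1) > gapMs) {
                sessions.add(current);          // gap exceeded: close the running session
                current = new ArrayList<>();
            }
            current.add(ts);
        }
        if (!current.isEmpty()) {
            sessions.add(current);              // close the last session
        }
        return sessions;
    }

    public static void main(String[] args) {
        long gap = 10 * 60 * 1000L;             // 10 minutes, as in withGap(Time.minutes(10))
        List<Long> arrivals = List.of(0L, 60_000L, 300_000L, 1_200_000L, 1_260_000L);
        System.out.println(split(arrivals, gap));
        // prints [[0, 60000, 300000], [1200000, 1260000]]
    }
}
```

The 15-minute pause between 300000 and 1200000 is longer than the 10-minute gap, so the elements fall into two sessions.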

2. Event time

A watermark strategy and a timestamp assigner are required.

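// Event time requires a watermark strategy and a timestamp assigner.
// Each input line is expected to look like "key,timestampMillis".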
        SingleOutputStreamOperator<Tuple2<String, Long>> map = input.map(new MapFunction<String, Tuple2<String, Long>>() {
            @Override
            public Tuple2<String, Long> map(String input) throws Exception {
                String[] fields = input.split(",");
                return new Tuple2<>(fields[0], Long.parseLong(fields[1]));
            }
        });

        SingleOutputStreamOperator<Tuple2<String, Long>> watermarks = map.assignTimestampsAndWatermarks(WatermarkStrategy.<Tuple2<String, Long>>forBoundedOutOfOrderness(Duration.ofSeconds(0))
                .withTimestampAssigner(new SerializableTimestampAssigner<Tuple2<String, Long>>() {
                    @Override
                    public long extractTimestamp(Tuple2<String, Long> input, long l) {
                        return input.f1;
                    }
                }));

        // Event-time session window with a fixed gap
        watermarks.keyBy(e -> e.f0)
                .window(EventTimeSessionWindows.withGap(Time.minutes(10)))
                .apply(new WindowFunction<Tuple2<String, Long>, String, String, TimeWindow>() {
                    @Override
                    public void apply(String s, TimeWindow timeWindow, Iterable<Tuple2<String, Long>> iterable, Collector<String> collector) throws Exception {
                        for (Tuple2<String, Long> stringLongTuple2 : iterable) {
                            collector.collect(stringLongTuple2.f0);
                        }
                    }
                })
                .print();
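
When an event-time session actually produces output depends on the watermark. The sketch below is plain Java without Flink and rests on two assumptions about Flink's built-in behavior: the bounded-out-of-orderness generator emits `maxTimestamp - outOfOrderness - 1`, and the default event-time trigger fires once the watermark reaches the window's last included timestamp (`end - 1`). The class and method names are hypothetical.

```java
class WatermarkSketch {
    /** Watermark produced from the largest timestamp seen so far (assumed formula, see lead-in). */
    static long watermark(long maxSeenTimestamp, long outOfOrdernessMs) {
        return maxSeenTimestamp - outOfOrdernessMs - 1;
    }

    /** An event-time window [start, end) may fire once the watermark reaches end - 1. */
    static boolean canFire(long windowEnd, long watermark) {
        return watermark >= windowEnd - 1;
    }

    public static void main(String[] args) {
        long gap = 10 * 60 * 1000L;
        // A single event at t=1000 for some key opens the session [1000, 601000).
        long windowEnd = 1_000L + gap;

        // Largest timestamp seen so far is 600999: the watermark is 600998, the session stays open.
        System.out.println(canFire(windowEnd, watermark(600_999L, 0L))); // prints false

        // Largest timestamp seen so far is 601001: the watermark is 601000, the session can fire.
        System.out.println(canFire(windowEnd, watermark(601_001L, 0L))); // prints true
    }
}
```

With `Duration.ofSeconds(0)` as in the snippet above, a session is therefore emitted only after a later event pushes the watermark past its end; if no further input arrives, the last session stays open.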

3. Fixed gap and dynamic gap

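// Fixed gap: every element uses the same 10-minute inactivity gap
EventTimeSessionWindows.withGap(Time.minutes(10));

// Dynamic gap: the gap is computed per element; extract() returns it in milliseconds and it must be positive
EventTimeSessionWindows.withDynamicGap(new SessionWindowTimeGapExtractor<Tuple2<String, Long>>() {
                    @Override
                    public long extract(Tuple2<String, Long> element) {
                        // Note: f1 also serves as the event timestamp in this example, so the gap grows with the timestamp;
                        // real code would typically derive the gap from a separate field
                        return element.f1 + 2000L;
                    }
                });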
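
A dynamic gap can be pictured as follows: every element first gets its own window `[timestamp, timestamp + gap)`, and windows of the same key that overlap or touch are merged into one session. The sketch below reproduces that idea in plain Java without Flink. The class `DynamicGapSessions`, the `{timestampMs, gapMs}` array layout, and the sample values are assumptions made for illustration, not part of the Flink API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class DynamicGapSessions {
    /**
     * Each event is {timestampMs, gapMs} and belongs to the same key.
     * Every event gets its own window [timestamp, timestamp + gap);
     * windows that overlap or touch are merged.
     * Returns the merged windows as {start, end} pairs.
     */
    static List<long[]> merge(List<long[]> events) {
        List<long[]> sorted = new ArrayList<>(events);
        sorted.sort(Comparator.comparingLong((long[] e) -> e[0]));
        List<long[]> merged = new ArrayList<>();
        for (long[] e : sorted) {
            long start = e[0];
            long end = e[0] + e[1];
            long[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && start <= last[1]) {
                last[1] = Math.max(last[1], end);     // extend the current session
            } else {
                merged.add(new long[] {start, end});  // start a new session
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<long[]> events = List.of(
                new long[] {0L, 5_000L},
                new long[] {3_000L, 20_000L},   // long gap: stretches the session to 23000
                new long[] {20_000L, 1_000L},   // falls inside the stretched session
                new long[] {30_000L, 1_000L});  // starts after 23000: new session
        for (long[] w : merge(events)) {
            System.out.println("[" + w[0] + ", " + w[1] + ")");
        }
        // prints [0, 23000) and then [30000, 31000)
    }
}
```

The second event asks for a 20-second gap, which keeps the session open long enough to absorb the third event; with a fixed 5-second gap the third event would have started a new session.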

4. Complete code example

import org.apache.flink.api.common.eventtime.SerializableTimestampAssigner;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows;
import org.apache.flink.streaming.api.windowing.assigners.ProcessingTimeSessionWindows;
import org.apache.flink.streaming.api.windowing.assigners.SessionWindowTimeGapExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

import java.time.Duration;

public class _04_WindowAssignerSession {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStreamSource<String> input = env.socketTextStream("localhost", 8888);

        // Parallelism is kept low for testing; in production, idle sources need handling (e.g. WatermarkStrategy#withIdleness)
        env.setParallelism(2);

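        // Event time requires a watermark strategy and a timestamp assigner.
        // Each input line is expected to look like "key,timestampMillis".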
        SingleOutputStreamOperator<Tuple2<String, Long>> map = input.map(new MapFunction<String, Tuple2<String, Long>>() {
            @Override
            public Tuple2<String, Long> map(String input) throws Exception {
                String[] fields = input.split(",");
                return new Tuple2<>(fields[0], Long.parseLong(fields[1]));
            }
        });

        SingleOutputStreamOperator<Tuple2<String, Long>> watermarks = map.assignTimestampsAndWatermarks(WatermarkStrategy.<Tuple2<String, Long>>forBoundedOutOfOrderness(Duration.ofSeconds(0))
                .withTimestampAssigner(new SerializableTimestampAssigner<Tuple2<String, Long>>() {
                    @Override
                    public long extractTimestamp(Tuple2<String, Long> input, long l) {
                        return input.f1;
                    }
                }));

        // Event-time session window with a fixed gap
        watermarks.keyBy(e -> e.f0)
                .window(EventTimeSessionWindows.withGap(Time.minutes(10)))
                .apply(new WindowFunction<Tuple2<String, Long>, String, String, TimeWindow>() {
                    @Override
                    public void apply(String s, TimeWindow timeWindow, Iterable<Tuple2<String, Long>> iterable, Collector<String> collector) throws Exception {
                        for (Tuple2<String, Long> stringLongTuple2 : iterable) {
                            collector.collect(stringLongTuple2.f0);
                        }
                    }
                })
                .print();

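        // Event-time session window with a dynamic gap
        watermarks.keyBy(e -> e.f0)
                .window(EventTimeSessionWindows.withDynamicGap(new SessionWindowTimeGapExtractor<Tuple2<String, Long>>() {
                    @Override
                    public long extract(Tuple2<String, Long> element) {
                        // The returned gap is in milliseconds and must be positive.
                        // Note: f1 is also the event timestamp here, so the gap grows with the timestamp.
                        return element.f1 + 2000L;
                    }
                }))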
                .apply(new WindowFunction<Tuple2<String, Long>, String, String, TimeWindow>() {
                    @Override
                    public void apply(String s, TimeWindow timeWindow, Iterable<Tuple2<String, Long>> iterable, Collector<String> collector) throws Exception {
                        for (Tuple2<String, Long> stringLongTuple2 : iterable) {
                            collector.collect(stringLongTuple2.f0);
                        }
                    }
                })
                .print();

        // Processing-time session window with a fixed gap
        input.keyBy(e -> e)
                .window(ProcessingTimeSessionWindows.withGap(Time.minutes(10)))
                .apply(new WindowFunction<String, String, String, TimeWindow>() {
                    @Override
                    public void apply(String s, TimeWindow timeWindow, Iterable<String> iterable, Collector<String> collector) throws Exception {
                        for (String string : iterable) {
                            collector.collect(string);
                        }
                    }
                })
                .print();

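        // Processing-time session window with a dynamic gap
        input.keyBy(e -> e)
                .window(ProcessingTimeSessionWindows.withDynamicGap(new SessionWindowTimeGapExtractor<String>() {
                    @Override
                    public long extract(String s) {
                        // Demo value only: epoch seconds are interpreted as a gap in milliseconds (roughly 20 days);
                        // real code would derive the gap from the element
                        return System.currentTimeMillis() / 1000;
                    }
                }))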
                .apply(new WindowFunction<String, String, String, TimeWindow>() {
                    @Override
                    public void apply(String s, TimeWindow timeWindow, Iterable<String> iterable, Collector<String> collector) throws Exception {
                        for (String string : iterable) {
                            collector.collect(string);
                        }
                    }
                })
                .print();

        env.execute();
    }
}