FlinkSQL Explained in One Document

Reference: Overview | Apache Flink (official documentation)

FlinkSQL Development Steps

Reference: Concepts & Common API | Apache Flink (official documentation)

Add the dependency:

XML:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-api-java-bridge</artifactId>
    <version>${flink.version}</version>
</dependency>

This dependency is the Java "bridge": it connects the Table API to the underlying DataStream API. There are separate Java and Scala variants.

If we want to run Table API and SQL programs in a local IDE, we also need the following dependencies:

XML:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-planner-loader</artifactId>
    <version>${flink.version}</version>
</dependency>

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-runtime</artifactId>
    <version>${flink.version}</version>
</dependency>

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-files</artifactId>
    <version>${flink.version}</version>
</dependency>

Step 1: Create a Table Environment

For a stream processor like Flink, data streams and tables differ in structure, so using the Table API and SQL requires a dedicated runtime context: the "table environment" (TableEnvironment).

For streaming scenarios the default configuration is usually all you need, so there is also a simpler way to create the table environment:

Java:
package com.bigdata;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class _001Demo {

    public static void main(String[] args) {

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tableEnv2 = StreamTableEnvironment.create(env);
        System.out.println(tableEnv2);
    }
}
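For reference, the snippet above also imports EnvironmentSettings and TableEnvironment: they belong to the more general way of creating a table environment, without going through a StreamExecutionEnvironment. A minimal sketch (streaming mode is the default; inBatchMode() would switch to batch execution):

Java:
// Create a TableEnvironment directly from EnvironmentSettings.
EnvironmentSettings settings = EnvironmentSettings.newInstance()
        .inStreamingMode()
        .build();
TableEnvironment tableEnv = TableEnvironment.create(settings);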

Step 2: Create a Table

A table is a familiar concept: it is the basic form of data storage in relational databases and the basic object that SQL operates on.

Java:
DataStreamSource<String> dataStreamSource = env.socketTextStream("localhost", 8899);
// register this stream as a table
tableEnv.createTemporaryView("table1", dataStreamSource);

Step 3: Query the Table

Once the table exists, the next step is to query and transform it. A query on a table corresponds to a transformation on the underlying stream.

Java:
//TableResult tableResult = tableEnv.executeSql("select * from table1");
//tableResult.print();
 Table table = tableEnv.sqlQuery("select * from table1");

First FlinkSQL Example

Java:
package com.bigdata.day06;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableResult;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

/**
 * @Description:
 * @program:FlinkDemo
 * @author: 闫哥
 * @create:2025-12-02 10:11:04
 **/
public class Demo01 {

    public static void main(String[] args) throws Exception {

        //1. env - prepare the environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // get the table environment
        StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

        DataStreamSource<String> dataStreamSource = env.socketTextStream("localhost", 8888);
        tableEnv.createTemporaryView("table1",dataStreamSource);

        /*TableResult tableResult = tableEnv.executeSql("select * from table1");
        tableResult.print();*/
        Table table = tableEnv.sqlQuery("select * from table1");
        table.printSchema(); // print the table schema
        // The data inside a Table cannot be printed directly; to get at it, convert the Table back into a stream.
        // toAppendStream is deprecated
        //DataStream<Row> appendStream = tableEnv.toAppendStream(table, Row.class);
        // the new API, same effect as toAppendStream
        DataStream<Row> dataStream = tableEnv.toDataStream(table, Row.class);

        dataStream.print();


        env.execute();
    }
}

Because the DataStream carries Row objects, the printed format is whatever Row's toString() method produces. The +I prefix means the row is an insert (new data).

Step 4: Output the Table

Creating and querying tables correspond to the Source and Transform stages of stream processing; the final stage, Sink, i.e. writing result data to an external system, corresponds to outputting a table.

In code, the most direct way to output a table is to call executeInsert() on a Table to write it into a registered table; the argument is the name under which that table was registered.

Java:
// register a table used to output data to an external system
tableEnv.executeSql("CREATE TABLE OutputTable ... WITH ( 'connector' = ... )");

// the result table obtained from a query
Table result = ...

// write the result table into the registered output table
result.executeInsert("OutputTable");

Under the hood, outputting a table means writing its data to a TableSink, a generic Table API interface for writing to external systems. It supports different file formats (such as CSV and Parquet), databases (such as JDBC and Elasticsearch) and message queues (such as Kafka).
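As a concrete illustration, a file sink could be registered with DDL along these lines; a minimal sketch using the filesystem connector and CSV format (the schema and path here are assumptions, and the CSV format needs flink-csv on the classpath):

SQL:
CREATE TABLE OutputTable (
  id STRING,
  vc INT
) WITH (
  'connector' = 'filesystem',      -- write to a directory on a file system
  'path' = 'file:///tmp/output',   -- assumed output path
  'format' = 'csv'                 -- CSV serialization
);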

Step 5: Converting Between Tables and Streams

1) Converting a DataStream into a Table

(1) Call the fromDataStream() method

Converting a DataStream into a table is simple: call the table environment's fromDataStream() method, which returns a Table object.

Java:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// get the table environment
StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

// read the data source
SingleOutputStreamOperator<WaterSensor> sensorDS = env.fromSource(...)

// convert the data stream into a table
Table sensorTable = tableEnv.fromDataStream(sensorDS);

Because the stream elements are already a well-defined POJO type, WaterSensor, each row of the resulting table corresponds to one WaterSensor, and the table's column names correspond to its fields.

We can also pass extra arguments to fromDataStream() to choose which fields to extract as table columns, in any order:

Java:
// extract id and vc from WaterSensor as the table's columns
Table sensorTable = tableEnv.fromDataStream(sensorDS, $("id"), $("vc"));

Fields can also be renamed via the expression's as() method:

Java:
// rename the id field to sid
Table sensorTable = tableEnv.fromDataStream(sensorDS, $("id").as("sid"), $("vc"));

(2) Call the createTemporaryView() method

fromDataStream() is simple and direct, but if we want to reference the table from SQL we still have to call the table environment's createTemporaryView() method to create a virtual view.

For this case there is a more compact call: createTemporaryView() can take the registered table name as its first argument and the DataStream directly as its second; any further arguments specify the table's fields.

Java:
tableEnv.createTemporaryView("sensorTable",sensorDS, $("id"),$("ts"),$("vc"));

Afterwards we can reference the table sensorTable directly in SQL.

2) Converting a Table into a DataStream

(1) Call the toDataStream() method

Converting a Table into a DataStream is just as simple: call the table environment's toDataStream() method. For example, we can convert the result table of an earlier query into a stream and print it:

Java:
tableEnv.toDataStream(table).print();

(2) Call the toChangelogStream() method

If the table is the result of a grouped aggregation, its rows keep getting updated. Such an updating table should not be converted directly into a DataStream for printing; instead we should record its change log. Every update to the table then becomes a record in a changelog stream, which we can convert and print.

The method to call is the table environment's toChangelogStream():

Java:
Table table = tableEnv.sqlQuery(
"SELECT id, sum(vc) " +
"FROM source " +
"GROUP BY id "
);

// convert the table into a changelog stream
tableEnv.toChangelogStream(table).print();

If your SQL statement contains a GROUP BY, the resulting Table must be converted with toChangelogStream(); toDataStream() cannot be used.

3) Supported Data Types

Broadly speaking, every data type supported in a DataStream is also supported in a Table; there are just a few details to watch when converting.

(1) Atomic types

In Flink, basic types (Integer, Double, String) and other types that cannot be split further are collectively called "atomic types". A DataStream of an atomic type converts into a table with a single column, whose type is inferred from the atomic type. You can also pass an argument to fromDataStream() to rename that column.

Java:
StreamTableEnvironment tableEnv = ...;

DataStream<Long> stream = ...;

// convert the stream into a dynamic table with a single column, renamed to myLong
Table table = tableEnv.fromDataStream(stream, $("myLong"));

(2) Tuple types

When an atomic type is not renamed, the default column name is "f0"; effectively the atomic type is treated as a one-element tuple, Tuple1.

Tables support Flink's Tuple types; by default the column names are the tuple's field names f0, f1, f2, ... All fields can be reordered or projected to a subset, and they can be renamed via the expression's as() method.

Java:
StreamTableEnvironment tableEnv = ...;

DataStream<Tuple2<Long, Integer>> stream = ...;

// convert the stream into a table containing only the f1 field
Table table = tableEnv.fromDataStream(stream, $("f1"));

// convert the stream into a table with f0 and f1, with their positions swapped
Table table = tableEnv.fromDataStream(stream, $("f1"), $("f0"));

// rename f1 to myInt and f0 to myLong
Table table = tableEnv.fromDataStream(stream, $("f1").as("myInt"), $("f0").as("myLong"));

(3) POJO types

Flink also supports composite types built from multiple fields, the most typical being plain Java objects (POJO types). Because a POJO already defines readable field names, converting such a stream into a Table is seamless.

When converting a POJO-typed DataStream into a Table, if no field names are specified, the POJO's own field names are used directly. POJO fields can likewise be reordered, projected and renamed.

Java:
StreamTableEnvironment tableEnv = ...;

DataStream<Event> stream = ...;

Table table = tableEnv.fromDataStream(stream);
Table table = tableEnv.fromDataStream(stream, $("user"));
Table table = tableEnv.fromDataStream(stream, $("user").as("myUser"), $("url").as("myUrl"));

(4) Row type

Flink also defines a more generic type for relational tables: the row (Row), which is the basic organizational form of data inside a Table.
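A minimal sketch of working with Row directly (the field names and sample values here are assumptions):

Java:
// Row is an untyped record indexed by position (or by name, if names are attached).
Row row = Row.of("s1", 1L, 10);   // a row with three fields
Object id = row.getField(0);      // "s1"
int arity = row.getArity();       // 3

When a DataStream<Row> carries proper row type information, fromDataStream() maps the row fields to table columns just like the Tuple and POJO cases above.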

To summarize the two directions:

DataStream -> Table: option 1 is fromDataStream(); option 2 is createTemporaryView().

Table -> DataStream: the old toAppendStream() has been replaced by toDataStream(), and toRetractStream() has been replaced by toChangelogStream().

A Complete Example

Now we can put all of the APIs above together into one complete program. Working with a stream of WaterSensor readings, we first query the readings with ts > 2 and then compute the total vc per sensor id; each can be expressed as one SQL statement. The code is as follows:

Java:
package com.bigdata;


import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

@Data
@NoArgsConstructor
@AllArgsConstructor
public class WaterSensor {

    private String id;
    private Long ts;
    private int vc;
}
Java:
package com.bigdata.flinksql;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

public class _002Demo {


    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // get the table environment
        StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

        DataStreamSource<WaterSensor> sensorDS = env.fromElements(
                new WaterSensor("s1", 1L, 1),
                new WaterSensor("s1", 2L, 2),
                new WaterSensor("s2", 2L, 2),
                new WaterSensor("s3", 3L, 3),
                new WaterSensor("s3", 4L, 4)
        );
        // requirement 1: records with ts > 2
        tableEnv.createTemporaryView("sensor", sensorDS);
        Table table = tableEnv.sqlQuery("select id,ts,vc from sensor where ts>2");
        DataStream<WaterSensor> dataStream = tableEnv.toDataStream(table, WaterSensor.class);
        dataStream.print();
        // requirement 2: total vc per sensor id
        // (this time using the other way of turning a stream into a table)
        Table sensorTable = tableEnv.fromDataStream(sensorDS);
        tableEnv.createTemporaryView("sensor2", sensorTable);

        Table table2 = tableEnv.sqlQuery("select id,sum(vc) from sensor2 group by id");
        DataStream<Row> changelogStream = tableEnv.toChangelogStream(table2);
        changelogStream.print();

        env.execute();
    }
}

Since we write FlinkSQL programs all the time, let's create an IDE file template:

Java:
#if (${PACKAGE_NAME} && ${PACKAGE_NAME} != "")package ${PACKAGE_NAME};#end
#parse("File Header.java")
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
/**
 @Description:
 @program: ${PROJECT_NAME}
 @author: 闫哥
 @create: ${YEAR}-${MONTH}-${DAY} ${HOUR}:${MINUTE}:${SECOND}
**/
public class ${NAME} {

    public static void main(String[] args) throws Exception {

    //1. env - prepare the environment
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setRuntimeMode(RuntimeExecutionMode.AUTOMATIC);
    // get the table environment from env
    StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

    //2. create the table objects
    //3. write the SQL statements
    //4. convert the Table into a stream

    //5. execute
    env.execute();
  }
}

Word Count Example

Using this template we can build a word-count example:

Java:
package com.bigdata.flinksql;

import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;
import org.apache.flink.util.Collector;

import static org.apache.flink.table.api.Expressions.$;

/**
 * @Description:
 * @program:FlinkDemo2
 * @author: 闫哥
 * @create:2025-04-19 17:06:52
 **/
public class _003FlinkSQLWordCount {

    public static void main(String[] args) throws Exception {

        //1. env - prepare the environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // get the table environment from env
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        //2. create the table object
        DataStreamSource<String> dataStreamSource = env.socketTextStream("localhost", 8899);
        // e.g. input line: hello hello word abc hello
        SingleOutputStreamOperator<Tuple2<String, Integer>> flatMapStream = dataStreamSource.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public void flatMap(String line, Collector<Tuple2<String, Integer>> collector) throws Exception {
                String[] arr = line.split("\\s+");
                for (String word : arr) {
                    collector.collect(Tuple2.of(word, 1));
                }
            }
        });
        tEnv.createTemporaryView("wordcount", flatMapStream,$("word"),$("num"));

        Table table = tEnv.sqlQuery("select word,sum(num) as cnt from wordcount group by word");
        //3. write the SQL statement
        //4. convert the Table into a stream
        DataStream<Row> changelogStream = tEnv.toChangelogStream(table);
        changelogStream.print();

        //tEnv.toDataStream(table).print();


        //5. execute
        env.execute();
    }
}

Test input: hello world spark hello world spark laoyan

If you convert this aggregated table with toDataStream() instead of toChangelogStream(), you get the following exception:
Exception in thread "main" org.apache.flink.table.api.TableException: Table sink '*anonymous_datastream_sink$2*' doesn't support consuming update changes which is produced by node GroupAggregate(groupBy=[word], select=[word, SUM(num) AS cnt])
	at org.apache.flink.table.planner.plan.optimize.program.FlinkChangelogModeInferenceProgram$SatisfyModifyKindSetTraitVisitor.createNewNode(FlinkChangelogModeInferenceProgram.scala:405)
	at org.apache.flink.table.planner.plan.optimize.program.FlinkChangelogModeInferenceProgram$SatisfyModifyKindSetTraitVisitor.visit(FlinkChangelogModeInferenceProgram.scala:185)
	at org.apache.flink.table.planner.plan.optimize.program.FlinkChangelogModeInferenceProgram$SatisfyModifyKindSetTraitVisitor.visitChild(FlinkChangelogModeInferenceProgram.scala:366)
	at org.apache.flink.table.planner.plan.optimize.program.FlinkChangelogModeInferenceProgram$SatisfyModifyKindSetTraitVisitor.$anonfun$visitChildren$1(FlinkChangelogModeInferenceProgram.scala:355)
	at org.apache.flink.table.planner.plan.optimize.program.FlinkChangelogModeInferenceProgram$SatisfyModifyKindSetTraitVisitor.$anonfun$visitChildren$1$adapted(FlinkChangelogModeInferenceProgram.scala:354)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:233)

Solution:

//DataStream<Row> resultStream = tEnv.toDataStream(table);

change it to:

tEnv.toChangelogStream(table);

  • +I : insert (a new row)

  • -U : the row before an update

  • +U : the row after an update
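As an illustration (not captured output), typing "hello hello" into the socket would produce changelog records along these lines:

+I[hello, 1]
-U[hello, 1]
+U[hello, 2]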

FlinkSQL-API

**Requirement:** use the Table API / SQL to count the words in a DataStream.

Java:
package com.bigdata.sql;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

@Data
@AllArgsConstructor
@NoArgsConstructor
public class WC{
    private String word;
    private int num;
}
Java:
package com.bigdata.day06;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.api.Schema;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;
import org.apache.flink.util.Collector;

import static org.apache.flink.table.api.Expressions.$;

/**
 * @Description:
 * @program:FlinkDemo
 * @author: 闫哥
 * @create:2025-12-02 11:02:46
 **/
public class Demo04 {

    public static void main(String[] args) throws Exception {

        //1. env - prepare the environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // get the table environment from env
        StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

        DataStreamSource<String> dataStreamSource = env.socketTextStream("localhost", 8888);
        // e.g. input line: hello abc hello spark
        SingleOutputStreamOperator<Tuple2<String, Integer>> flatMap = dataStreamSource.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public void flatMap(String line, Collector<Tuple2<String, Integer>> out) throws Exception {
                String[] arr = line.split("\\s+");
                for (String word : arr) {
                    out.collect(Tuple2.of(word, 1));
                }
            }
        });

        Table table = tableEnv.fromDataStream(flatMap,$("word"),$("num"));
        // DSL (Table API) syntax
        Table resultTable = table.groupBy($("word")).select($("word"), $("num").sum().as("sumNum"));
        // resultTable.printSchema();


        // You can build a Schema here, but the result is still Row objects, so it adds little; omitted from now on
        /*Schema schema = Schema.newBuilder()
                .column("word", DataTypes.STRING())
                .column("sumNum", DataTypes.INT())
                .build();*/

        //DataStream<Row> changelogStream = tableEnv.toChangelogStream(resultTable,schema);
        DataStream<Row> changelogStream = tableEnv.toChangelogStream(resultTable);
        //changelogStream.print();

        changelogStream.map(new MapFunction<Row, WC>() {
            @Override
            public WC map(Row row) throws Exception {

                String word = (String) row.getField("word");
                int sumNum = (int) row.getField("sumNum");

                return new WC(word,sumNum);
            }
        }).print();


        env.execute();
    }
}
A possible error when running this:
Exception in thread "main" org.apache.flink.table.api.ValidationException: Too many fields referenced from an atomic type.
	at org.apache.flink.table.typeutils.FieldInfoUtils.extractFieldInfoFromAtomicType(FieldInfoUtils.java:473)

If you hit the error above, check whether the WC entity class has a no-argument constructor.

A New Approach (Must Master): Connectors

Turning a DataStream into a Table and then turning the result back into a DataStream is cumbersome.

What follows is the newer FlinkSQL style, which is simpler and more efficient: defining tables with connectors.

datagen + print

Java:
package com.bigdata.flinksql.day02;

import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.TableResult;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

/**
 * @Description: demonstrates how connectors are used in Flink SQL; no particular real-world meaning
 * @program: FlinkDemo2
 * @author: 闫哥
 * @create: 2025-04-20 09:29:01
 **/
public class _001Demo {

    public static void main(String[] args) throws Exception {

        //1. env - prepare the environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setRuntimeMode(RuntimeExecutionMode.STREAMING);
        // get the table environment from env
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        //2. create the source table
        tEnv.executeSql("CREATE TABLE source ( \n" +
                "    id INT, \n" +
                "    ts BIGINT, \n" +
                "    vc INT\n" +
                ") WITH ( \n" +
                "    'connector' = 'datagen', \n" +
                "    'rows-per-second'='1', \n" +
                "    'fields.id.kind'='random', \n" +
                "    'fields.id.min'='1', \n" +
                "    'fields.id.max'='10', \n" +
                "    'fields.ts.kind'='sequence', \n" +
                "    'fields.ts.start'='1', \n" +
                "    'fields.ts.end'='1000000', \n" +
                "    'fields.vc.kind'='random', \n" +
                "    'fields.vc.min'='1', \n" +
                "    'fields.vc.max'='100'\n" +
                ");\n");
        //3. create the sink table
        tEnv.executeSql("CREATE TABLE sink (\n" +
                "    id INT, \n" +
                "    sumVC INT \n" +
                ") WITH (\n" +
                "'connector' = 'print'\n" +
                ");\n");
        //4. insert the aggregation result from source into sink
        tEnv.executeSql("insert into sink select id,sum(vc) sumVC from source group by id");

        // You might expect to query the sink to inspect the result, but this sink uses the print connector,
        // which writes straight to the console; querying it would fail with:
        // Connector 'print' can only be used as a sink. It cannot be used as a source.
        /*TableResult tableResult = tEnv.executeSql("select * from sink");
        tableResult.print();*/


        //5. execute - not needed here; calling env.execute() would fail with
        // No operators defined in streaming topology. Cannot execute.
        // env.execute();
    }
}

FlinkSQL-kafka

**Requirement:** consume data from Kafka topic1, keep only the records whose status is success, and write them to Kafka topic2.

First create the two topics (topic1 and topic2). Sample messages sent to topic1:

{"user_id": "1", "page_id":"1", "status": "success"}
{"user_id": "1", "page_id":"1", "status": "success"}
{"user_id": "1", "page_id":"1", "status": "success"}
{"user_id": "1", "page_id":"1", "status": "success"}
{"user_id": "1", "page_id":"1", "status": "fail"}

Implementation

https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/connectors/table/kafka/

XML:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka</artifactId>
    <version>${flink.version}</version>
</dependency>

{"user_id": "1", "page_id":"1", "status": "success"}

CREATE TABLE table1 (
  `user_id` int,
  `page_id` int,
  `status` STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'topic1',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'g1',
  'scan.startup.mode' = 'latest-offset',
  'format' = 'json'
)

CREATE TABLE table2 (
  `user_id` int,
  `page_id` int,
  `status` STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'topic2',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'g1',
  'scan.startup.mode' = 'latest-offset',
  'format' = 'json'
)

Then the job itself is just one SQL statement:
insert into table2  select * from table1 where status='success'
Java:
package com.bigdata.flinksql.day02;

import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

/**
 * @Description:
 * @program:FlinkDemo2
 * @author: 闫哥
 * @create:2025-04-20 09:49:50
 **/
public class _002Demo {

    public static void main(String[] args) throws Exception {

        //1. env - prepare the environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // get the table environment from env
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        tEnv.executeSql("CREATE TABLE table1 (\n" +
                "  `user_id` int,\n" +
                "  `page_id` int,\n" +
                "  `status` STRING\n" +
                ") WITH (\n" +
                "  'connector' = 'kafka',\n" +
                "  'topic' = 'topic1',\n" +
                "  'properties.bootstrap.servers' = 'node01:9092',\n" +
                "  'properties.group.id' = 'g1',\n" +
                "  'scan.startup.mode' = 'latest-offset',\n" +
                "  'format' = 'json'\n" +
                ")");

        tEnv.executeSql("CREATE TABLE table2 (\n" +
                "  `user_id` int,\n" +
                "  `page_id` int,\n" +
                "  `status` STRING\n" +
                ") WITH (\n" +
                "  'connector' = 'kafka',\n" +
                "  'topic' = 'topic2',\n" +
                "  'properties.bootstrap.servers' = 'node01:9092',\n" +
                "  'properties.group.id' = 'g1',\n" +
                "  'scan.startup.mode' = 'latest-offset',\n" +
                "  'format' = 'json'\n" +
                ")");

        tEnv.executeSql(" insert into table2 select * from table1 where status='success'");

    }
}

Note: if you end the program with

env.execute();

it will throw an error. That call serves no purpose here, because executeSql() already submits the job, and the error does not affect the final result; just delete the trailing env.execute();.

Also note that the query-and-insert steps can always be collapsed into a single call:

tEnv.executeSql("insert into table2 select * from table1 where status = 'success' ");

If you get the following error, the JSON format dependency is missing:

Caused by: org.apache.flink.table.api.ValidationException: Could not find any factory for identifier 'json' that implements 'org.apache.flink.table.factories.DeserializationFormatFactory' in the classpath.

Available factory identifiers are:

raw
	at org.apache.flink.table.factories.FactoryUtil.discoverFactory(FactoryUtil.java:319)
	at org.apache.flink.table.factories.FactoryUtil$TableFactoryHelper.discoverOptionalFormatFactory(FactoryUtil.java:751)

Add this dependency:

XML:
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-json</artifactId>
  <version>1.17.2</version>
</dependency>

Test it: start a console producer for topic1, then a console consumer for topic2.

As you will see, the producer writes both success and fail records to topic1, but only the success records show up in topic2.

FlinkSQL: Kafka to MySQL

**Requirement:** consume data from Kafka topic1, keep only the records whose status is success, and write them to MySQL.

The test messages for topic1:
{"user_id": "1", "page_id":"1", "status": "success"}
{"user_id": "1", "page_id":"1", "status": "success"}
{"user_id": "1", "page_id":"1", "status": "success"}
{"user_id": "1", "page_id":"1", "status": "success"}
{"user_id": "1", "page_id":"1", "status": "fail"}

In MySQL you need a database (here called demo) containing a table t_success:

SQL:
create table t_success
(
    id      int auto_increment,
    user_id int         null,
    page_id int         null,
    status  varchar(20) null,
    constraint t_success_pk
        primary key (id)
);

The official example DDL looks like this:

SQL:
CREATE TABLE MyUserTable (
  id BIGINT,
  name STRING,
  age INT,
  status BOOLEAN,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
   'connector' = 'jdbc',
   'url' = 'jdbc:mysql://localhost:3306/spark_demo',
   'table-name' = 't_success',
    'username' = 'root',
    'password' = '123456'

);
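Note that the jdbc connector also needs its own dependencies on the classpath; a sketch of what is typically required (the versions are placeholders and must match your Flink release and MySQL server):

XML:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-jdbc</artifactId>
    <version><!-- the version matching your Flink release --></version>
</dependency>

<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version><!-- your MySQL driver version --></version>
</dependency>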

Implementation

Java:
package com.bigdata.flinksql.day02;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

/**
 * @Description:
 * @program:FlinkDemo2
 * @author: 闫哥
 * @create:2025-04-20 09:49:50
 **/
public class _003Demo {

    public static void main(String[] args) throws Exception {

        //1. env - prepare the environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // get the table environment from env
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        tEnv.executeSql("CREATE TABLE table1 (\n" +
                "  `user_id` int,\n" +
                "  `page_id` int,\n" +
                "  `status` STRING\n" +
                ") WITH (\n" +
                "  'connector' = 'kafka',\n" +
                "  'topic' = 'topic1',\n" +
                "  'properties.bootstrap.servers' = 'node01:9092',\n" +
                "  'properties.group.id' = 'g1',\n" +
                "  'scan.startup.mode' = 'latest-offset',\n" +
                "  'format' = 'json'\n" +
                ")");

        tEnv.executeSql("CREATE TABLE table2 (\n" +
                "  `user_id` int,\n" +
                "  `page_id` int,\n" +
                "  `status` STRING\n" +
                ") WITH (\n" +
                "    'connector' = 'jdbc',\n" +
                "    'url' = 'jdbc:mysql://node01:3306/gongcheng?useUnicode=true&characterEncoding=utf8',\n" +
                "    'table-name' = 't_success', \n" +
                "    'username' = 'root',\n" +
                "    'password' = '123456'\n" +
                ")");

        tEnv.executeSql(" insert into table2 select * from table1 where status='success'");

    }
}

Example: reading MySQL data with FlinkSQL:

Java:
package com.bigdata.flinksql.day02;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.TableResult;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

/**
 * @Description: reading data from MySQL with FlinkSQL
 * @program: FlinkDemo2
 * @author: 闫哥
 * @create: 2025-04-20 09:49:50
 **/
public class _004Demo {

    public static void main(String[] args) throws Exception {

        //1. env - prepare the environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // get the table environment from env
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        tEnv.executeSql("CREATE TABLE table1 (\n" +
                "  `user_id` int,\n" +
                "  `page_id` int,\n" +
                "  `status` STRING\n" +
                ") WITH (\n" +
                "    'connector' = 'jdbc',\n" +
                "    'url' = 'jdbc:mysql://node01:3306/gongcheng?useUnicode=true&characterEncoding=utf8',\n" +
                "    'table-name' = 't_success', \n" +
                "    'username' = 'root',\n" +
                "    'password' = '123456'\n" +
                ")");

        // first way to print: query and print the TableResult
        //TableResult tableResult = tEnv.executeSql("select * from table1 where status='success'");
        //tableResult.print();

        // second way to print: insert into a print-connector table
        tEnv.executeSql("CREATE TABLE table2 (\n" +
                "    user_id INT, \n" +
                "  `status` STRING\n" +
                ") WITH (\n" +
                "'connector' = 'print'\n" +
                ");\n");
        tEnv.executeSql("insert into table2 select user_id,status from table1 where status='success'");

    }
}

Part 2: FlinkSQL Windows

About windows:

A window has two properties: window size and slide interval.

1. By size and slide, windows fall into tumbling and sliding windows:
   size = slide  -> tumbling window
   size > slide  -> sliding window

2. Flink has two time semantics, EventTime and ProcessingTime:
   EventTime: the time the event actually occurred
   ProcessingTime: the time the Flink program processes the event

Combining the two, windows can be: tumbling or sliding windows based on event time, and tumbling or sliding windows based on processing time.

A processing-time window fires when the system clock reaches the window's end time and the window contains data.
An event-time window fires when the watermark reaches the window's end time and the window contains data.

What is a watermark? A watermark is an extra timestamp attached to the stream, computed as watermark = maxEventTime - allowed lateness.
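A quick worked example of that formula: with 3 seconds of allowed lateness, once a record with event time 10:10:40 has been seen, the watermark is 10:10:40 - 3 s = 10:10:37; a window ending at 10:10:30 can therefore fire, while a window ending at 10:11:00 cannot fire yet.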

1.1 Tumbling Windows (TUMBLE)

1.2 Sliding Windows (HOP)

1.3 Cumulative Windows (CUMULATE)

In practice we often meet another kind of requirement: the statistics period is fairly long, but we want to emit the current running value at regular intervals within it.

Unlike a sliding window, the values emitted within one statistics period keep accumulating: each output builds on the previous ones. This special window is called a cumulative window (Cumulate Window); it accumulates results over a statistics period.

A cumulative window has two core parameters: the maximum window size (max window size) and the accumulation step (step).

The maximum window size is simply the statistics period, i.e. the span we ultimately want statistics for. The first window created has the size of one step; each subsequent window extends the previous one by another step, until the maximum window size is reached. In SQL this is defined with the CUMULATE() function.
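A sketch of what a cumulative-window query looks like, assuming a table such as table1 below with an event_time time attribute: a 1-hour statistics period that emits a running total every minute.

SQL:
SELECT
   window_start,
   window_end,
   username,
   sum(price) AS totalMoney
FROM TABLE(
   CUMULATE(TABLE table1, DESCRIPTOR(event_time), INTERVAL '1' MINUTE, INTERVAL '1' HOUR))
GROUP BY window_start, window_end, username;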

eventTime

Test data:

{"username":"zs","price":20,"event_time":"2025-07-17 10:10:10"}
{"username":"zs","price":15,"event_time":"2025-07-17 10:10:30"}
{"username":"zs","price":20,"event_time":"2025-07-17 10:10:40"}
{"username":"zs","price":20,"event_time":"2025-07-17 10:11:03"}
{"username":"zs","price":20,"event_time":"2025-07-17 10:11:04"}
{"username":"zs","price":20,"event_time":"2025-07-17 10:12:04"}
{"username":"zs","price":20,"event_time":"2025-07-17 11:12:04"}
{"username":"zs","price":20,"event_time":"2025-07-17 11:12:04"}
{"username":"zs","price":20,"event_time":"2025-07-17 12:12:04"}
{"username":"zs","price":20,"event_time":"2025-07-18 12:12:04"}

Requirement: every minute, compute each user's total spending and number of purchases within that minute.

This calls for a tumbling window.

The SQL:

SQL:
CREATE TABLE table1 (
  `username` string,
  `price` int,
  `event_time` TIMESTAMP(3),
  watermark for event_time as event_time - interval '3' second
) WITH (
  'connector' = 'kafka',
  'topic' = 'topic1',
  'properties.bootstrap.servers' = 'bigdata01:9092',
  'properties.group.id' = 'g1',
  'scan.startup.mode' = 'latest-offset',
  'format' = 'json'
);

The query:
select 
   window_start,
   window_end,
   username,
   count(1) zongNum,
   sum(price) totalMoney 
   from table(TUMBLE(TABLE table1, DESCRIPTOR(event_time), INTERVAL '60' second))
group by window_start,window_end,username;

A common error:

Exception in thread "main" org.apache.flink.table.api.ValidationException: SQL validation failed. The window function TUMBLE(TABLE table_name, DESCRIPTOR(timecol), datetime interval) requires the timecol is a time attribute type, but is VARCHAR(2147483647).

This means the column passed to the window function is not a time attribute. It must be declared as TIMESTAMP(3) (not VARCHAR), and since event time is used, a watermark must also be defined, otherwise it fails.
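If the raw field arrives as a string, one way to obtain a time attribute is a computed column; a sketch only, assuming the timestamps are formatted as 'yyyy-MM-dd HH:mm:ss':

SQL:
CREATE TABLE table1 (
  `username` STRING,
  `price` INT,
  `event_time_str` STRING,
  -- computed column: parse the string into TIMESTAMP(3)
  `event_time` AS TO_TIMESTAMP(event_time_str),
  WATERMARK FOR event_time AS event_time - INTERVAL '3' SECOND
) WITH (
  ...
);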

Requirement: with a tumbling window on event time, compute each user's total spending per minute.

Java:
package com.bigdata.flinksql.day02;

import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

/**
 * @Description:
 * @program:FlinkDemo2
 * @author: 闫哥
 * @create:2025-04-20 10:59:41
 **/
public class _005Demo {

    public static void main(String[] args) throws Exception {

        //1. env - prepare the environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);

        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
        tEnv.executeSql("CREATE TABLE table1 (\n" +
                "  `username` string,\n" +
                "  `price` int,\n" +
                "  `event_time` TIMESTAMP(3),\n" +
                "  watermark for event_time as event_time - interval '3' second\n" +
                ") WITH (\n" +
                "  'connector' = 'kafka',\n" +
                "  'topic' = 'topic1',\n" +
                "  'properties.bootstrap.servers' = 'node01:9092',\n" +
                "  'properties.group.id' = 'g1',\n" +
                "  'scan.startup.mode' = 'latest-offset',\n" +
                "  'format' = 'json'\n" +
                ")");

        // the windowed query

        tEnv.executeSql("select \n" +
                "   window_start,\n" +
                "   window_end,\n" +
                "   username,\n" +
                "   count(1) zongNum,\n" +
                "   sum(price) totalMoney \n" +
                "   from table(TUMBLE(TABLE table1, DESCRIPTOR(event_time), INTERVAL '60' second))\n" +
                "group by window_start,window_end,username").print();





    }
}
Note: when running locally, the default parallelism equals the number of CPU cores; set the parallelism to 1 to see results quickly.

When testing, remember this uses event time. If the first record is:
{"username":"zs","price":20,"event_time":"2025-07-17 10:10:10"}
the first window is 2025-07-17 10:10:00 ~ 2025-07-17 10:11:00.
Because the watermark allows 3 seconds of lateness, the first window only fires once a record with event_time >= 10:11:03 arrives, for example:
{"username":"zs","price":20,"event_time":"2025-07-17 10:11:03"}

So the first batch of test data is:
{"username":"zs","price":20,"event_time":"2025-07-17 10:10:10"}
{"username":"zs","price":15,"event_time":"2025-07-17 10:10:30"}
{"username":"zs","price":20,"event_time":"2025-07-17 10:10:40"}
{"username":"zs","price":20,"event_time":"2025-07-17 10:11:03"}

The result of the first window then appears in the console.

If instead your test data is:

{"username":"zs","price":20,"event_time":"2025-07-17 10:10:10"}
{"username":"zs","price":15,"event_time":"2025-07-17 10:10:30"}
{"username":"zs","price":20,"event_time":"2025-07-17 10:10:40"}
{"username":"zs","price":20,"event_time":"2025-07-17 10:11:03"}
{"username":"zs","price":20,"event_time":"2025-07-17 10:11:04"}
{"username":"zs","price":20,"event_time":"2025-07-17 10:12:04"}
{"username":"zs","price":20,"event_time":"2025-07-17 11:12:04"}
{"username":"zs","price":20,"event_time":"2025-07-17 11:12:04"}
{"username":"zs","price":20,"event_time":"2025-07-17 12:12:04"}
{"username":"zs","price":20,"event_time":"2025-07-18 12:12:04"}

each window's aggregation appears in the console accordingly.

Now test a sliding window: every 10 seconds, compute the last 1 minute of data:

Java:
package com.bigdata.flinksql.day02;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

/**
 * @Description:
 * @program:FlinkDemo2
 * @author: 闫哥
 * @create:2025-04-20 10:59:41
 **/
public class _006Demo {

    public static void main(String[] args) throws Exception {

        //1. env - prepare the environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);

        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
        tEnv.executeSql("CREATE TABLE table1 (\n" +
                "  `username` string,\n" +
                "  `price` int,\n" +
                "  `event_time` TIMESTAMP(3),\n" +
                "  watermark for event_time as event_time - interval '3' second\n" +
                ") WITH (\n" +
                "  'connector' = 'kafka',\n" +
                "  'topic' = 'topic1',\n" +
                "  'properties.bootstrap.servers' = 'node01:9092',\n" +
                "  'properties.group.id' = 'g1',\n" +
                "  'scan.startup.mode' = 'latest-offset',\n" +
                "  'format' = 'json'\n" +
                ")");

        // the windowed query

        tEnv.executeSql("select \n" +
                "   window_start,\n" +
                "   window_end,\n" +
                "   username,\n" +
                "   count(1) zongNum,\n" +
                "   sum(price) totalMoney \n" +
                "   from table(HOP(TABLE table1, DESCRIPTOR(event_time), INTERVAL '10' second,INTERVAL '60' second))\n" +
                "group by window_start,window_end,username").print();





    }
}

Walking through the result: when the first record is entered,

{"username":"zs","price":20,"event_time":"2025-07-17 10:10:10"}

the earliest window containing it ends at 10:10:20; going back 1 minute, that first window is
10:09:20 ~ 10:10:20, and with the 3-second watermark it fires at 10:10:23.
The second record,
{"username":"zs","price":15,"event_time":"2025-07-17 10:10:30"}
carries event time 10:10:30 >= 10:10:23, so the first window fires.

processTime

Test data:

{"username":"zs","price":20}
{"username":"lisi","price":15}
{"username":"lisi","price":20}
{"username":"zs","price":20}
{"username":"zs","price":20}
{"username":"zs","price":20}
{"username":"zs","price":20}

package com.bigdata.flinksql.day02;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

/**
 * @Description:
 * @program:FlinkDemo2
 * @author: 闫哥
 * @create:2025-04-20 10:59:41
 **/
public class _007Demo {

    public static void main(String[] args) throws Exception {

        //1. env - prepare the environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);

        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
        tEnv.executeSql("CREATE TABLE table1 (\n" +
                "  `username` string,\n" +
                "  `price` int,\n" +
                "  `event_time` as proctime()\n" +
                ") WITH (\n" +
                "  'connector' = 'kafka',\n" +
                "  'topic' = 'topic1',\n" +
                "  'properties.bootstrap.servers' = 'node01:9092',\n" +
                "  'properties.group.id' = 'g1',\n" +
                "  'scan.startup.mode' = 'latest-offset',\n" +
                "  'format' = 'json'\n" +
                ")");

        // the windowed query

        tEnv.executeSql("select \n" +
                "   window_start,\n" +
                "   window_end,\n" +
                "   username,\n" +
                "   count(1) zongNum,\n" +
                "   sum(price) totalMoney \n" +
                "   from table(TUMBLE(TABLE table1, DESCRIPTOR(event_time), INTERVAL '60' second))\n" +
                " group by window_start,window_end,username").print();

    }
}

Result: output appears once the wall clock reaches the next whole minute (:00 seconds).

You may have to wait up to a minute before anything shows up; be patient.

Windows can be tumbling or sliding, and time can be event time or processing time; combining them gives four cases and four examples.

Here is the sliding window + processing time variant:

Java:
package com.bigdata.flinksql.day02;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

/**
 * @Description:
 * @program:FlinkDemo2
 * @author: 闫哥
 * @create:2025-04-20 10:59:41
 **/
public class _008Demo {

    public static void main(String[] args) throws Exception {

        //1. env - prepare the environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);

        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
        tEnv.executeSql("CREATE TABLE table1 (\n" +
                "  `username` string,\n" +
                "  `price` int,\n" +
                "  `event_time` as proctime() \n" +
                ") WITH (\n" +
                "  'connector' = 'kafka',\n" +
                "  'topic' = 'topic1',\n" +
                "  'properties.bootstrap.servers' = 'node01:9092',\n" +
                "  'properties.group.id' = 'g1',\n" +
                "  'scan.startup.mode' = 'latest-offset',\n" +
                "  'format' = 'json'\n" +
                ")");

        // the windowed query

        tEnv.executeSql("select \n" +
                "   window_start,\n" +
                "   window_end,\n" +
                "   username,\n" +
                "   count(1) zongNum,\n" +
                "   sum(price) totalMoney \n" +
                "   from table(HOP(TABLE table1, DESCRIPTOR(event_time), INTERVAL '10' second,INTERVAL '60' second))\n" +
                "group by window_start,window_end,username").print();
    }
}

When testing, if nothing appears in the console and the windows never fire, try the following:

1. Recreate the topic with a single partition.

2. List all Kafka brokers in the bootstrap servers, e.g. bigdata01:9092,bigdata02:9092,bigdata03:9092.

If the data is sent at 11:41:42, the first result appears at 11:41:50:

| +I | 2025-04-20 11:40:50.000 | 2025-04-20 11:41:50.000 | zs   | 5 | 100 |
| +I | 2025-04-20 11:40:50.000 | 2025-04-20 11:41:50.000 | lisi | 2 |  35 |

The last window should end at 11:42:40:

| +I | 2025-04-20 11:40:50.000 | 2025-04-20 11:41:50.000 | zs   | 5 | 100 |
| +I | 2025-04-20 11:40:50.000 | 2025-04-20 11:41:50.000 | lisi | 2 |  35 |
| +I | 2025-04-20 11:41:00.000 | 2025-04-20 11:42:00.000 | zs   | 5 | 100 |
| +I | 2025-04-20 11:41:00.000 | 2025-04-20 11:42:00.000 | lisi | 2 |  35 |
| +I | 2025-04-20 11:41:10.000 | 2025-04-20 11:42:10.000 | zs   | 5 | 100 |
| +I | 2025-04-20 11:41:10.000 | 2025-04-20 11:42:10.000 | lisi | 2 |  35 |
| +I | 2025-04-20 11:41:20.000 | 2025-04-20 11:42:20.000 | zs   | 5 | 100 |
| +I | 2025-04-20 11:41:20.000 | 2025-04-20 11:42:20.000 | lisi | 2 |  35 |
| +I | 2025-04-20 11:41:30.000 | 2025-04-20 11:42:30.000 | zs   | 5 | 100 |
| +I | 2025-04-20 11:41:30.000 | 2025-04-20 11:42:30.000 | lisi | 2 |  35 |
| +I | 2025-04-20 11:41:40.000 | 2025-04-20 11:42:40.000 | zs   | 5 | 100 |
| +I | 2025-04-20 11:41:40.000 | 2025-04-20 11:42:40.000 | lisi | 2 |  35 |
