Flink CDC的使用

MySQL数据准备

sql 复制代码
create database if not exists test;
use test;
drop table if exists stu;
create table stu (id int primary key auto_increment, name varchar(100), age int);
insert into stu(name, age) values("张三",18);
insert into stu(name, age) values("李四",20);
insert into stu(name, age) values("王五",21);

注意:表必须有主键

开启MySQL binlog

修改MySQL配置,开启binlog

$ sudo vim /etc/my.cnf,添加如下设置

server-id = 1
log-bin=mysql-bin
binlog_format=row
binlog-do-db=test

注意:启用binlog的数据库,需根据实际情况作出修改

重启mysql

复制代码
$ sudo systemctl restart mysqld

代码开发

依赖

Flink CDC依赖

XML 复制代码
		<!--cdc 依赖-->
        <dependency>
            <groupId>com.ververica</groupId>
            <artifactId>flink-connector-mysql-cdc</artifactId>
            <version>2.4.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-api-java-bridge</artifactId>
            <version>${flink.version}</version>
        </dependency>

完整依赖

XML 复制代码
    <properties>
        <maven.compiler.source>8</maven.compiler.source>
        <maven.compiler.target>8</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <flink.version>1.17.1</flink.version>
        <flink-cdc.vesion>2.4.0</flink-cdc.vesion>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-java</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-clients</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-api-java-bridge</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-planner-loader</artifactId>
            <version>${flink.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-runtime</artifactId>
            <version>${flink.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-files</artifactId>
            <version>${flink.version}</version>
        </dependency>

        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>8.0.33</version>
        </dependency>

        <!--目前中央仓库还没有 jdbc的连接器,暂时用一个快照版本-->
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-jdbc</artifactId>
            <version>1.17-SNAPSHOT</version>
        </dependency>


        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-files</artifactId>
            <version>${flink.version}</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-datagen</artifactId>
            <version>${flink.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-statebackend-rocksdb</artifactId>
            <version>${flink.version}</version>
<!--            <scope>provided</scope>-->
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>3.3.4</version>
<!--            <scope>provided</scope>-->
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-kafka</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-statebackend-changelog</artifactId>
            <version>${flink.version}</version>
            <scope>runtime</scope>
        </dependency>

        <dependency>
            <groupId>com.google.code.findbugs</groupId>
            <artifactId>jsr305</artifactId>
            <version>1.3.9</version>
        </dependency>


        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-api-java-bridge</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-planner-loader</artifactId>
            <version>${flink.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-runtime</artifactId>
            <version>${flink.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-files</artifactId>
            <version>${flink.version}</version>
        </dependency>

        <!--cdc 依赖-->
        <dependency>
            <groupId>com.ververica</groupId>
            <artifactId>flink-connector-mysql-cdc</artifactId>
            <version>${flink-cdc.vesion}</version>
        </dependency>


    </dependencies>

Flink代码

Flink CDC捕获MySQL变更数据(增加、修改、删除),输出到控制台。

java 复制代码
import com.ververica.cdc.connectors.mysql.source.MySqlSource;
import com.ververica.cdc.connectors.mysql.table.StartupOptions;
import com.ververica.cdc.debezium.JsonDebeziumDeserializationSchema;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FlinkCDCDemo {
    public static void main(String[] args) throws Exception {
        // 环境
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(4);

        // 数据源
        MySqlSource<String> mySqlSource = MySqlSource.<String>builder()
                .hostname("node4")
                .port(3306)
                .username("root")
                .password("000000")
                .databaseList("test")
                .tableList("test.stu")
                .deserializer(new JsonDebeziumDeserializationSchema())
                .startupOptions(StartupOptions.initial())
                .build();
        DataStreamSource<String> dataStreamSource = env.fromSource(mySqlSource, WatermarkStrategy.noWatermarks(),
                "mysql_source").setParallelism(1);


        // 处理数据

        // 输出数据
        dataStreamSource.print();

        // 执行
        env.execute();


    }
}

运行程序,确保程序无报错,看到如下输出:

18:58:51,826 INFO io.debezium.connector.mysql.MySqlStreamingChangeEventSource - Keepalive thread is running

测试

添加数据

mysql添加数据

复制代码
mysql> insert into stu(name, age) values("赵六",23);

IDEA控制台输出

复制代码
{"before":null,"after":{"id":4,"name":"赵六","age":23},"source":{"version":"1.9.7.Final","connector":"mysql","name":"mysql_binlog_source","ts_ms":1719831654000,"snapshot":"false","db":"test","sequence":null,"table":"stu","server_id":1,"gtid":null,"file":"mysql-bin.000001","pos":2300,"row":0,"thread":13,"query":null},"op":"c","ts_ms":1719831654692,"transaction":null}

格式化输出

复制代码
{
    "before": null,
    "after": {
        "id": 4,
        "name": "赵六",
        "age": 23
    },
    "source": {
        "version": "1.9.7.Final",
        "connector": "mysql",
        "name": "mysql_binlog_source",
        "ts_ms": 1719831654000,
        "snapshot": "false",
        "db": "test",
        "sequence": null,
        "table": "stu",
        "server_id": 1,
        "gtid": null,
        "file": "mysql-bin.000001",
        "pos": 2300,
        "row": 0,
        "thread": 13,
        "query": null
    },
    "op": "c",
    "ts_ms": 1719831654692,
    "transaction": null
}

关注before、after符合增加数据的逻辑,op为c表示添加数据

修改数据

mysql修改数据

复制代码
mysql> update stu set name="zl", age=19 where name="赵六";

IDEA控制台输出

复制代码
{"before":{"id":4,"name":"赵六","age":23},"after":{"id":4,"name":"zl","age":19},"source":{"version":"1.9.7.Final","connector":"mysql","name":"mysql_binlog_source","ts_ms":1719831987000,"snapshot":"false","db":"test","sequence":null,"table":"stu","server_id":1,"gtid":null,"file":"mysql-bin.000001","pos":2604,"row":0,"thread":13,"query":null},"op":"u","ts_ms":1719831987238,"transaction":null}

格式化输出

复制代码
{
    "before": {
        "id": 4,
        "name": "赵六",
        "age": 23
    },
    "after": {
        "id": 4,
        "name": "zl",
        "age": 19
    },
    "source": {
        "version": "1.9.7.Final",
        "connector": "mysql",
        "name": "mysql_binlog_source",
        "ts_ms": 1719831987000,
        "snapshot": "false",
        "db": "test",
        "sequence": null,
        "table": "stu",
        "server_id": 1,
        "gtid": null,
        "file": "mysql-bin.000001",
        "pos": 2604,
        "row": 0,
        "thread": 13,
        "query": null
    },
    "op": "u",
    "ts_ms": 1719831987238,
    "transaction": null
}

关注before、after符合更新的逻辑,op为u表示更新数据

删除数据

mysql删除数据

复制代码
mysql> delete from stu where id=4;

IDEA控制台输出

复制代码
{"before":{"id":4,"name":"zl","age":19},"after":null,"source":{"version":"1.9.7.Final","connector":"mysql","name":"mysql_binlog_source","ts_ms":1719832151000,"snapshot":"false","db":"test","sequence":null,"table":"stu","server_id":1,"gtid":null,"file":"mysql-bin.000001","pos":2913,"row":0,"thread":13,"query":null},"op":"d","ts_ms":1719832151198,"transaction":null}
​

格式化输出

复制代码
{
    "before": {
        "id": 4,
        "name": "zl",
        "age": 19
    },
    "after": null,
    "source": {
        "version": "1.9.7.Final",
        "connector": "mysql",
        "name": "mysql_binlog_source",
        "ts_ms": 1719832151000,
        "snapshot": "false",
        "db": "test",
        "sequence": null,
        "table": "stu",
        "server_id": 1,
        "gtid": null,
        "file": "mysql-bin.000001",
        "pos": 2913,
        "row": 0,
        "thread": 13,
        "query": null
    },
    "op": "d",
    "ts_ms": 1719832151198,
    "transaction": null
}

关注before、after符合删除的逻辑,op为d表示删除数据

完成!enjoy it!

相关推荐
kakwooi1 小时前
Hadoop---MapReduce(3)
大数据·hadoop·mapreduce
数新网络1 小时前
《深入浅出Apache Spark》系列②:Spark SQL原理精髓全解析
大数据·sql·spark
昨天今天明天好多天6 小时前
【数据仓库】
大数据
油头少年_w7 小时前
大数据导论及分布式存储HadoopHDFS入门
大数据·hadoop·hdfs
Elastic 中国社区官方博客8 小时前
释放专利力量:Patently 如何利用向量搜索和 NLP 简化协作
大数据·数据库·人工智能·elasticsearch·搜索引擎·自然语言处理
力姆泰克8 小时前
看电动缸是如何提高农机的自动化水平
大数据·运维·服务器·数据库·人工智能·自动化·1024程序员节
力姆泰克8 小时前
力姆泰克电动缸助力农业机械装备,提高农机的自动化水平
大数据·服务器·数据库·人工智能·1024程序员节
QYR市场调研8 小时前
自动化研磨领域的革新者:半自动与自动自磨机的技术突破
大数据·人工智能
半部论语9 小时前
第三章:TDengine 常用操作和高级功能
大数据·时序数据库·tdengine
EasyGBS10 小时前
国标GB28181公网直播EasyGBS国标GB28181软件管理解决方案
大数据·网络·音视频·媒体·视频监控·gb28181