Building a Wide Table with Flink for Real-Time Ingestion: A Case Study

  1. Package preparation

- Flink 1.14.5 distribution package
- Flink CDC MySQL connector
- Flink SQL SDB connector
- MySQL driver
- SDB driver
- Flink JDBC MySQL connector

  2. Ingestion flow diagram
  3. Flink installation and deployment

  3.1 Upload the Flink package to the server and extract it

```
tar -zxvf flink-1.14.5-bin-scala_2.11.tgz -C /opt/
```

  3.2 Copy the dependencies into Flink

```
cp sdb-flink-connector-3.4.8-jar-with-dependencies.jar /opt/flink-1.14.5/lib
cp sequoiadb-driver-3.4.8.jre8.jar /opt/flink-1.14.5/lib
cp flink-sql-connector-mysql-cdc-2.2.1.jar /opt/flink-1.14.5/lib
cp flink-connector-jdbc_2.11-1.14.6.jar /opt/flink-1.14.5/lib
```

  3.3 Edit the flink-conf.yaml file

```
vi conf/flink-conf.yaml

### hostname (or IP address) of the master
jobmanager.rpc.address: sdb1
### temporary directory used by each TaskManager
io.tmp.dirs: /opt/flink-1.14.5/tmp
```

  3.4 Edit the masters file

```
vi conf/masters

# host and port of the master
upgrade1:8081
```

  3.5 Edit the workers file

```
vi conf/workers

# cluster hostnames
upgrade1
upgrade2
upgrade3
```

  3.6 Copy the installation to the other cluster machines

```
scp -r /opt/flink-1.14.5 sdbadmin@upgrade2:/opt/
scp -r /opt/flink-1.14.5 sdbadmin@upgrade3:/opt/
```

  3.7 Start the Flink cluster

```
[sdbadmin@upgrade1 flink-1.14.5]$ ./bin/start-cluster.sh
```

  3.8 Start the Flink SQL client

```
[sdbadmin@upgrade1 flink-1.14.5]$ ./bin/sql-client.sh
```

  4. Real-time ingestion

Write a data-generation program to produce the test data.
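The generator itself is not shown in the original. A minimal sketch in Python (the table and column names come from the DDL below; the batch size, value ranges, and the helper name `gen_inserts` are assumptions for illustration) that emits INSERT statements, which can then be fed to the `mysql` client or a driver:

```python
import random
import time

# per-table payload column, matching the sbtest1/2/3 schemas
TABLES = {"sbtest1": "name1", "sbtest2": "name2", "sbtest3": "name3"}

def gen_inserts(n, seed=None):
    """Build INSERT statements for the three source tables.

    The tables are filled in lockstep, so their auto-increment ids line
    up and the later Flink left join on id can assemble the wide rows.
    """
    rng = random.Random(seed)
    now = time.strftime("%Y-%m-%d %H:%M:%S")
    stmts = []
    for i in range(1, n + 1):
        uuid = rng.randint(1, 1_000_000)
        age = rng.randint(1, 99)
        for table, col in TABLES.items():
            stmts.append(
                f"INSERT INTO {table} (uuid, {col}, age, time1) "
                f"VALUES ({uuid}, '{col}_{i}', {age}, '{now}');"
            )
    return stmts

if __name__ == "__main__":
    for stmt in gen_inserts(10, seed=42):
        print(stmt)
```

Piping the printed statements into the `mysql` client against the sbtest database populates all three source tables with matching ids.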

4.1 Environment preparation

4.1.1 Enable the MySQL binlog

  1. Create the binlog directory

```
[sdbadmin@upgrade1 mysql]$ mkdir /opt/sequoiasql/mysql/database/3306/binlog
```

  2. Enable the binlog

```
vim /opt/sequoiasql/mysql/database/3306/auto.cnf

# add the following settings:
log_bin=/opt/sequoiasql/mysql/database/3306/binlog
binlog_format=ROW
expire_logs_days=1
server_id=1
```

After the configuration is done, restart MySQL.

```
[sdbadmin@upgrade1 mysql]$ ./bin/sdb_mysql_ctl stop myinst
[sdbadmin@upgrade1 mysql]$ ./bin/sdb_mysql_ctl start myinst
```

4.1.2 Create the MySQL tables

Create the database:

```
create database sbtest;
use sbtest;
```

Create the source tables:

```
CREATE TABLE sbtest1 (
  id INT UNSIGNED AUTO_INCREMENT,
  uuid INT(10),
  name1 CHAR(120),
  age INT(4),
  time1 DATETIME,
  PRIMARY KEY(id)
);

CREATE TABLE sbtest2 (
  id INT UNSIGNED AUTO_INCREMENT,
  uuid INT(10),
  name2 CHAR(120),
  age INT(4),
  time1 DATETIME,
  PRIMARY KEY(id)
);

CREATE TABLE sbtest3 (
  id INT UNSIGNED AUTO_INCREMENT,
  uuid INT(10),
  name3 CHAR(120),
  age INT(4),
  time1 DATETIME,
  PRIMARY KEY(id)
);
```

Create the wide table that Flink writes into:

```
CREATE TABLE sbtest4 (
  id INT UNSIGNED AUTO_INCREMENT,
  uuid INT(10),
  name1 CHAR(120),
  name2 CHAR(120),
  name3 CHAR(120),
  age INT(4),
  time1 DATETIME,
  PRIMARY KEY(id)
);
```

4.1.3 Create the Flink mapping tables

This step uses flink-sql-connector-mysql-cdc-2.2.1.jar.

```
CREATE TABLE sbtest1_mysql (
  id INT,
  uuid INT,
  name1 CHAR(120),
  age INT,
  time1 TIMESTAMP,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = '192.168.223.135',
  'port' = '3306',
  'username' = 'root',
  'password' = 'root',
  'database-name' = 'sbtest',
  'table-name' = 'sbtest1'
);

CREATE TABLE sbtest2_mysql (
  id INT,
  uuid INT,
  name2 CHAR(120),
  age INT,
  time1 TIMESTAMP,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = '192.168.223.135',
  'port' = '3306',
  'username' = 'root',
  'password' = 'root',
  'database-name' = 'sbtest',
  'table-name' = 'sbtest2'
);

CREATE TABLE sbtest3_mysql (
  id INT,
  uuid INT,
  name3 CHAR(120),
  age INT,
  time1 TIMESTAMP,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = '192.168.223.135',
  'port' = '3306',
  'username' = 'root',
  'password' = 'root',
  'database-name' = 'sbtest',
  'table-name' = 'sbtest3'
);
```

Create the Flink --> MySQL sink mapping table.

This step uses flink-connector-jdbc_2.11-1.14.6.jar.

```
CREATE TABLE sbtest4_mysql (
  id BIGINT,
  uuid INT,
  name1 CHAR(120),
  name2 CHAR(120),
  name3 CHAR(120),
  age INT,
  time1 TIMESTAMP,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://192.168.223.135:3306/sbtest',
  'username' = 'root',
  'password' = 'root',
  'table-name' = 'sbtest4'
);
```

Create the Flink --> SDB sink mapping table.

This step uses sdb-flink-connector-3.4.8-jar-with-dependencies.jar.

```
CREATE TABLE sbtest_sdb (
  id BIGINT,
  uuid INT,
  name1 CHAR(120),
  name2 CHAR(120),
  name3 CHAR(120),
  age INT,
  time1 TIMESTAMP,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'sequoiadb',
  'bulksize' = '1',
  'hosts' = '192.168.223.135:11810',
  'collectionspace' = 'sbtest',
  'collection' = 'sbtest4'
);
```

4.2 Real-time ingestion into MySQL

4.2.1 Flink left join

```
select sdb1.id, sdb1.uuid, sdb1.name1, sdb2.name2, sdb3.name3, sdb1.age, sdb1.time1
from sbtest1_mysql sdb1
left join sbtest2_mysql sdb2 on sdb1.id = sdb2.id
left join sbtest3_mysql sdb3 on sdb1.id = sdb3.id;
```

4.2.2 Real-time ingestion into MySQL

```
insert into sbtest4_mysql
select sdb1.id, sdb1.uuid, sdb1.name1, sdb2.name2, sdb3.name3, sdb1.age, sdb1.time1
from sbtest1_mysql sdb1
left join sbtest2_mysql sdb2 on sdb1.id = sdb2.id
left join sbtest3_mysql sdb3 on sdb1.id = sdb3.id;
```

Check the running Flink job (e.g. with `./bin/flink list` or in the Web UI on port 8081).

Querying sbtest4 shows the data has been ingested successfully.

4.3 Real-time ingestion into SDB

4.3.1 Flink left join

```
select sdb1.id, sdb1.uuid, sdb1.name1, sdb2.name2, sdb3.name3, sdb1.age, sdb1.time1
from sbtest1_mysql sdb1
left join sbtest2_mysql sdb2 on sdb1.id = sdb2.id
left join sbtest3_mysql sdb3 on sdb1.id = sdb3.id;
```

4.3.2 Real-time ingestion into SDB

```
insert into sbtest_sdb
select sdb1.id, sdb1.uuid, sdb1.name1, sdb2.name2, sdb3.name3, sdb1.age, sdb1.time1
from sbtest1_mysql sdb1
left join sbtest2_mysql sdb2 on sdb1.id = sdb2.id
left join sbtest3_mysql sdb3 on sdb1.id = sdb3.id;
```

Check the running Flink job (e.g. with `./bin/flink list` or in the Web UI on port 8081).

Querying the sbtest.sbtest4 collection shows the data has been ingested successfully.
