The previous article covered the business background for introducing the Debezium CDC middleware. This chapter walks through getting started with Debezium: installation, and capturing change data. Previous article: From 0 to 1: A Journey of Query Optimization over Massive Data
Contents
- Debezium introduction
- Business architecture
- Debezium installation
- Kafka and kafka-ui installation
- Creating a Debezium connector
- Listening for MySQL binlog messages
- Testing
What is Debezium?
Official site: debezium.io/
In short: it is a CDC tool similar to Canal. It can listen for database events (all the commonly used databases are supported) and stream them to other systems for processing.
Definition (an AI-generated introduction)
Debezium is an open-source distributed platform for capturing and propagating database change events. It monitors a database's transaction log and delivers change events as a stream for other systems to consume and process.
Debezium supports the major relational databases, such as MySQL, PostgreSQL, and Oracle, and captures inserts, updates, and deletes in real time. With Debezium you can easily build real-time data pipelines, data synchronization, and event-driven architectures.
Debezium works by connecting to the database's transaction log to capture change events. It listens for transaction commits and records the change events to a dedicated message queue or log. Other systems can subscribe to these events, receive the database's changes in real time, and process them as needed.
Debezium brings many benefits, including real-time data synchronization, event-driven architectures, microservice integration, and data-warehouse loading. It provides a reliable, low-latency transport mechanism so that systems can exchange data efficiently and work together.
Debezium architecture
Debezium can listen for change messages from mainstream databases such as MySQL, PostgreSQL, and Oracle. When data changes, it can publish the messages to Kafka, where other middleware picks them up.
Business architecture
Debezium listens to the MySQL binlog and publishes messages to Kafka. An etl-service then pulls messages from Kafka in batches, assembles them into ES document models, and inserts them into ES. That completes the data flow from the database to ES.
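The etl-service side of this flow can be sketched in a few lines. The example below maps a Debezium MySQL change event (its "op"/"before"/"after" envelope fields are Debezium's documented event format) to an ES-style write action. The index name "users" and the use of "id" as the document id are illustrative assumptions, not something prescribed by this setup.

```python
import json

def to_es_action(raw_event: str) -> dict:
    """Map a Debezium MySQL change event to an ES-style write action.
    The "op"/"before"/"after" envelope fields follow Debezium's event format;
    the index name "users" and id field are assumptions for illustration."""
    payload = json.loads(raw_event).get("payload", {})
    op = payload.get("op")  # "c"=create, "u"=update, "d"=delete, "r"=snapshot read
    if op == "d":
        doc = payload["before"]  # on delete, only the old row is present
        return {"action": "delete", "index": "users", "id": doc["id"]}
    doc = payload["after"]  # on create/update/read, "after" holds the new row
    return {"action": "index", "index": "users", "id": doc["id"], "doc": doc}

sample = json.dumps({
    "payload": {"op": "c", "before": None,
                "after": {"id": 1, "username": "nacos"}}
})
print(to_es_action(sample)["action"])  # index
```

A real etl-service would apply such a transform to each batch pulled from Kafka before issuing an ES bulk request.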
Installing a Kafka + kafka-ui cluster
yaml
# Copyright VMware, Inc.
# SPDX-License-Identifier: APACHE-2.0
version: "2"
services:
  kafka-0:
    image: docker.io/bitnami/kafka:3.5
    container_name: kafka-0
    ports:
      - 19092:9092
      - 19093:9093
    environment:
      # KRaft settings
      - KAFKA_ENABLE_KRAFT=yes
      - KAFKA_CFG_NODE_ID=0
      - KAFKA_CFG_PROCESS_ROLES=controller,broker
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=0@kafka-0:9093,1@kafka-1:9093,2@kafka-2:9093
      - KAFKA_KRAFT_CLUSTER_ID=abcdefghijklmnopqrstuv
      # Listeners
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://:9092
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      - KAFKA_CFG_INTER_BROKER_LISTENER_NAME=PLAINTEXT
    volumes:
      - ./data/kafka0:/bitnami/kafka
    networks:
      - mx-wk
  kafka-1:
    image: docker.io/bitnami/kafka:3.5
    container_name: kafka-1
    ports:
      - 29092:9092
      - 29093:9093
    environment:
      # KRaft settings
      - KAFKA_ENABLE_KRAFT=yes
      - KAFKA_CFG_NODE_ID=1
      - KAFKA_CFG_PROCESS_ROLES=controller,broker
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=0@kafka-0:9093,1@kafka-1:9093,2@kafka-2:9093
      - KAFKA_KRAFT_CLUSTER_ID=abcdefghijklmnopqrstuv
      # Listeners
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://:9092
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      - KAFKA_CFG_INTER_BROKER_LISTENER_NAME=PLAINTEXT
    volumes:
      - ./data/kafka1:/bitnami/kafka
    networks:
      - mx-wk
  kafka-2:
    image: docker.io/bitnami/kafka:3.5
    container_name: kafka-2
    ports:
      - 39092:9092
      - 39093:9093
    environment:
      # KRaft settings
      - KAFKA_ENABLE_KRAFT=yes
      - KAFKA_CFG_NODE_ID=2
      - KAFKA_CFG_PROCESS_ROLES=controller,broker
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=0@kafka-0:9093,1@kafka-1:9093,2@kafka-2:9093
      - KAFKA_KRAFT_CLUSTER_ID=abcdefghijklmnopqrstuv
      # Listeners
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://:9092
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      - KAFKA_CFG_INTER_BROKER_LISTENER_NAME=PLAINTEXT
    volumes:
      - ./data/kafka2:/bitnami/kafka
    networks:
      - mx-wk
  kafkaui:
    image: provectuslabs/kafka-ui:latest
    ports:
      - 7080:8080
    depends_on:
      - kafka-0
      - kafka-1
      - kafka-2
    environment:
      - KAFKA_CLUSTERS_0_NAME=mx
      - KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS=10.16.1.1:19092,10.16.1.1:29092,10.16.1.1:39092
    networks:
      - mx-wk
networks:
  mx-wk:
    external: true
    name: mx-wk
- Run: docker-compose -f docker.yaml up -d
- Open kafka-ui; if the installation succeeded, the following page is displayed
Installing Debezium
yaml
version: '2'
services:
  # apicurio:
  #   image: apicurio/apicurio-registry-mem:2.2.5.Final
  #   networks:
  #     - net_kafka
  #   ports:
  #     - 8095:8080
  #   links:
  #     - kafka
  connect:
    image: quay.io/debezium/connect:2.3
    networks:
      - mx-wk
    ports:
      - 7750:8083
    environment:
      ## Kafka connection addresses
      - BOOTSTRAP_SERVERS=10.16.1.1:19092,10.16.1.1:29092,10.16.1.1:39092
      - GROUP_ID=1
      - CONFIG_STORAGE_TOPIC=mx.connect_configs
      - OFFSET_STORAGE_TOPIC=mx.connect_offsets
      - STATUS_STORAGE_TOPIC=mx.connect_statuses
      # - KEY_CONVERTER=io.apicurio.registry.utils.converter.AvroConverter
      # - VALUE_CONVERTER=io.apicurio.registry.utils.converter.AvroConverter
      # - CONNECT_KEY_CONVERTER=io.apicurio.registry.utils.converter.AvroConverter
      # - CONNECT_KEY_CONVERTER_APICURIO.REGISTRY_URL=http://172.17.12.251:8095/apis/registry/v2
      # - CONNECT_KEY_CONVERTER_APICURIO_REGISTRY_AUTO-REGISTER=true
      # - CONNECT_KEY_CONVERTER_APICURIO_REGISTRY_FIND-LATEST=true
      # - CONNECT_VALUE_CONVERTER=io.apicurio.registry.utils.converter.AvroConverter
      # - CONNECT_VALUE_CONVERTER_APICURIO_REGISTRY_URL=http://172.17.12.251:8095/apis/registry/v2
      # - CONNECT_VALUE_CONVERTER_APICURIO_REGISTRY_AUTO-REGISTER=true
      # - CONNECT_VALUE_CONVERTER_APICURIO_REGISTRY_FIND-LATEST=true
      # - CONNECT_SCHEMA_NAME_ADJUSTMENT_MODE=avro
      # - ENABLE_APICURIO_CONVERTERS=true
      # - Xmx=1G
networks:
  mx-wk:
    external: true
    name: mx-wk
Run: docker-compose -f debezium.yaml up -d
Creating a Debezium connector
Once Debezium is up, capturing MySQL changes requires creating a connector through the REST API that Debezium Connect exposes; one connector per business line is recommended. Send the "create a new connector" request below (mind the database connection details, the monitored tables, and the Kafka connection details). If it succeeds, the API returns normally; then send the "get connector status" request. If you get a result like the following, the connector was created successfully.
http
### -- List active connectors
GET http://127.0.0.1:7750/connectors/
### -- Create a new connector
POST http://127.0.0.1:7750/connectors
Content-Type: application/json
{
  "name": "connector-test-mx",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "127.0.0.1",
    "database.port": "3306",
    "database.user": "root",
    "database.password": "12345678",
    "database.server.id": "10001",
    "database.include.list": "nacos_devtest",
    "topic.prefix": "mx",
    "table.include.list": "nacos_devtest.users",
    "snapshot.mode": "schema_only",
    "schema.history.internal.kafka.bootstrap.servers": "10.16.1.1:19092,10.16.1.1:29092,10.16.1.1:39092",
    "schema.history.internal.kafka.topic": "mx.schemahistory.nacos_devtest",
    "include.schema.changes": "true"
  }
}
###
DELETE http://127.0.0.1:7750/connectors/connector-test-mx
### -- Get a connector's information
GET http://127.0.0.1:7750/connectors/connector-test-mx
### -- Get a connector's configuration
GET http://127.0.0.1:7750/connectors/connector-test-mx/config
### -- Update a connector's configuration
PUT http://127.0.0.1:7750/connectors/connector-test-mx/config
Content-Type: application/json
{
  "connector.class": "io.debezium.connector.mysql.MySqlConnector",
  "database.hostname": "127.0.0.1",
  "database.port": "3306",
  "database.user": "root",
  "database.password": "12345678",
  "database.server.id": "10001",
  "database.include.list": "nacos_devtest",
  "topic.prefix": "mx",
  "table.include.list": "nacos_devtest.users",
  "snapshot.mode": "schema_only",
  "schema.history.internal.kafka.bootstrap.servers": "10.16.1.1:19092,10.16.1.1:29092,10.16.1.1:39092",
  "schema.history.internal.kafka.topic": "mx.schemahistory.nacos_devtest",
  "include.schema.changes": "true"
}
### -- Get a connector's status
GET http://127.0.0.1:7750/connectors/connector-test-mx/status
### -- Get a connector's running tasks
GET http://127.0.0.1:7750/connectors/connector-test-mx/tasks
### -- Get the status of a connector task
GET http://127.0.0.1:7750/connectors/connector-test-mx/tasks/0/status
### -- Pause a connector and its tasks
PUT http://127.0.0.1:7750/connectors/connector-test-mx/pause
### -- Resume a paused connector
PUT http://127.0.0.1:7750/connectors/connector-test-mx/resume
### -- Restart a connector
POST http://127.0.0.1:7750/connectors/connector-test-mx/restart
### -- Restart a task
POST http://127.0.0.1:7750/connectors/connector-test-mx/tasks/0/restart
### -- Delete a connector
DELETE http://127.0.0.1:7750/connectors/connector-test-mx
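When scripting these calls, it helps to wait until the connector and all of its tasks report RUNNING before trusting the pipeline. Below is a minimal sketch against the /status endpoint above (the port 7750 comes from the compose file in this article; `connector_is_running` and `fetch_status` are hypothetical helper names):

```python
import json
from urllib import request

CONNECT_URL = "http://127.0.0.1:7750"  # Kafka Connect REST endpoint from this setup

def connector_is_running(status: dict) -> bool:
    """True when the connector and every task report state RUNNING.
    `status` is the JSON body returned by GET /connectors/<name>/status."""
    if status.get("connector", {}).get("state") != "RUNNING":
        return False
    return all(t.get("state") == "RUNNING" for t in status.get("tasks", []))

def fetch_status(name: str) -> dict:
    """Fetch connector status over the Connect REST API (requires the stack to be up)."""
    with request.urlopen(f"{CONNECT_URL}/connectors/{name}/status") as resp:
        return json.load(resp)

# Offline example of the JSON shape that /status returns:
example = {"name": "connector-test-mx",
           "connector": {"state": "RUNNING", "worker_id": "172.18.0.5:8083"},
           "tasks": [{"id": 0, "state": "RUNNING"}]}
print(connector_is_running(example))  # True
```

A deployment script could poll `fetch_status` in a loop after the POST that creates the connector, and fail fast if any task reports FAILED.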
Testing
- Manually change data in one of the monitored tables
- Check the console for new messages
- You can see that Kafka receives the change messages. At this point Debezium is installed successfully, and these messages can later be consumed for business-side processing
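With topic.prefix set to mx and table.include.list set to nacos_devtest.users, the row-change events land on the topic mx.nacos_devtest.users, since Debezium names data topics `<topic.prefix>.<database>.<table>`. A consumer can route messages by parsing the topic name; `parse_debezium_topic` below is a hypothetical helper:

```python
def parse_debezium_topic(topic: str, prefix: str = "mx"):
    """Split a Debezium data-topic name (<topic.prefix>.<database>.<table>)
    into its parts; returns None when the name doesn't match that shape."""
    parts = topic.split(".")
    if len(parts) != 3 or parts[0] != prefix:
        return None
    return {"database": parts[1], "table": parts[2]}

print(parse_debezium_topic("mx.nacos_devtest.users"))
# {'database': 'nacos_devtest', 'table': 'users'}
```

This kind of routing lets one consumer group handle events for several monitored tables while ignoring unrelated topics.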
Summary
This chapter got you started with Debezium: an introduction to the middleware, then installation and usage, ending with a successful install and test.
If this article helped you, please give it a like.