Replicating Oracle to Kafka with OGG

1. OGG software installation

Installing OGG itself is not covered here. Note that the target side needs the OGG for Big Data (OGG_BigData) build.

2. OGG configuration

2.1 Oracle database configuration (source)

Enable archive log mode. The database must be shut down cleanly and then mounted, not opened:

SQL> startup mount
SQL> alter database archivelog;
SQL> alter database open;

Enable force logging:

SQL> alter database force logging;

Enable supplemental logging:

SQL> alter database add supplemental log data;

Check the result:

SQL> Select SUPPLEMENTAL_LOG_DATA_MIN, FORCE_LOGGING from v$database;

SUPPLEME FOR
-------- ---
YES      YES

Switch the log file so that supplemental logging takes effect:

SQL> alter system switch logfile;

System altered.

Create the ogg user in the database and grant it the privileges needed for OGG replication:

SQL> create tablespace ogg datafile '/u01/oraprod/db/apps_st/data/ogg01.dbf' size 300m autoextend on ;
SQL> create user ogg identified by ogg default tablespace ogg;
SQL> grant connect,resource,create session,alter session  to ogg;
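-- Allows flashback queries (or direct reads from the database) when the redo log lacks enough information
SQL> grant flashback any table to ogg;
-- Grants the OGG user the privileges required of an OGG administrator
SQL> exec dbms_goldengate_auth.grant_admin_privilege('OGG');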

Set the enable_goldengate_replication parameter:

SQL> alter system set enable_goldengate_replication = true scope=both;     
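The source-side settings above can be checked in one pass before any OGG process is created. The following is a minimal sketch, not part of OGG or Oracle tooling: it assumes the four values have already been read from the database (for example LOG_MODE, FORCE_LOGGING and SUPPLEMENTAL_LOG_DATA_MIN from v$database, and enable_goldengate_replication from v$parameter) and passed in as a plain mapping. The function name and the mapping keys are hypothetical.

```python
# Hypothetical helper: compares values read from the source database with
# the settings this walkthrough expects. It does not connect to Oracle.
EXPECTED = {
    "LOG_MODE": {"ARCHIVELOG"},
    "FORCE_LOGGING": {"YES"},
    # Assumption: IMPLICIT is also acceptable, since it means minimal
    # supplemental logging is active because of other supplemental logging.
    "SUPPLEMENTAL_LOG_DATA_MIN": {"YES", "IMPLICIT"},
    "ENABLE_GOLDENGATE_REPLICATION": {"TRUE"},
}


def missing_prerequisites(settings):
    """Return the names of settings whose value is not an accepted one."""
    problems = []
    for name, accepted in EXPECTED.items():
        value = str(settings.get(name, "")).strip().upper()
        if value not in accepted:
            problems.append(name)
    return problems


if __name__ == "__main__":
    sample = {
        "LOG_MODE": "ARCHIVELOG",
        "FORCE_LOGGING": "YES",
        "SUPPLEMENTAL_LOG_DATA_MIN": "YES",
        "ENABLE_GOLDENGATE_REPLICATION": "false",
    }
    print(missing_prerequisites(sample))
```

An empty list means all four settings match; any returned name points at the step above that still needs to be run.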

2.2 OGG process configuration

2.2.1 Initial setup

Open the GGSCI command interface and create the GoldenGate subdirectories (on both source and target):

[oracle@test1 ogg]$ ./ggsci 

Oracle GoldenGate Command Interpreter for Oracle
Version 12.2.0.2.2 OGGCORE_12.2.0.2.0_PLATFORMS_170630.0419_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Jun 30 2017 14:42:26
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2017, Oracle and/or its affiliates. All rights reserved.



GGSCI (test1) 1> create subdirs

Creating subdirectories under current directory /u01/app/ogg

Parameter files                /u01/app/ogg/dirprm: already exists
Report files                   /u01/app/ogg/dirrpt: already exists
Checkpoint files               /u01/app/ogg/dirchk: already exists
Process status files           /u01/app/ogg/dirpcs: already exists
SQL script files               /u01/app/ogg/dirsql: already exists
Database definitions files     /u01/app/ogg/dirdef: already exists
Extract data files             /u01/app/ogg/dirdat: already exists
Temporary files                /u01/app/ogg/dirtmp: already exists
Credential store files         /u01/app/ogg/dircrd: already exists
Masterkey wallet files         /u01/app/ogg/dirwlt: already exists
Dump files                     /u01/app/ogg/dirdmp: already exists

2.2.2 Source checkpoint table and MGR process configuration

Configure the GLOBALS parameter file (on both source and target):

Source:
GGSCI (oracle11g) 1> dblogin userid ogg password ogg
Successfully logged into database.

GGSCI (oracle11g as ogg@orcl) 2> edit param ./globals
GGSCHEMA ogg
CHECKPOINTTABLE ogg.ggschkpt
SYSLOG NONE

Target:
GGSCI (oracle11g as ogg@orcl) 2> edit param ./globals
CHECKPOINTTABLE ogg.ggschkpt

Connect to the database as the ogg user (source):

GGSCI (test1) 4> dblogin userid ogg password ogg;
Successfully logged into database.

Create the checkpoint table in the database:
GGSCI (test1 as goldengate@test1) 6> add CHECKPOINTTABLE ogg.ggschkpt                                   
Successfully created checkpoint table goldengate.ggschkpt.   

Edit the MGR process parameters (on both source and target):

Source:
GGSCI (test1 as goldengate@test1) 7> edit params mgr
PORT 7809
DYNAMICPORTLIST 7810-7899
USERID ogg,PASSWORD ogg
AUTORESTART ER *,RETRIES 3, WAITMINUTES 5, RESETMINUTES 60
LAGREPORTMINUTES 10
LAGCRITICALMINUTES 10
PURGEOLDEXTRACTS ./dirdat/*, USECHECKPOINTS, MINKEEPHOURS 12
PURGEDDLHISTORY MINKEEPDAYS 7, MAXKEEPDAYS 14, FREQUENCYHOURS 3
PURGEMARKERHISTORY MINKEEPDAYS 7, MAXKEEPDAYS 14, FREQUENCYHOURS 3

Target:
GGSCI (test1 as goldengate@test1) 7> edit params mgr
PORT 7809
DYNAMICPORTLIST 7810-7899
AUTORESTART ER *,RETRIES 3, WAITMINUTES 5, RESETMINUTES 60
LAGREPORTMINUTES 10
LAGCRITICALMINUTES 10
PURGEOLDEXTRACTS ./dirdat/*, USECHECKPOINTS, MINKEEPHOURS 12

Start MGR (on both source and target):

GGSCI (test2 as goldengate@test1) 7> start mgr
Manager started.

2.2.3 Adding table-level supplemental logging on the source

Add table-level supplemental logging (source):

GGSCI (test1) 3> dblogin userid ogg password ogg;
Successfully logged into database.

GGSCI (test1 as goldengate@test1) 13> add trandata CUX.CUX_TEMP_FOR_OGG_KEY
GGSCI (test1 as goldengate@test1) 13> info trandata CUX.CUX_TEMP_FOR_OGG_KEY

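Note: when DDL extraction is used, schema-level or table-level supplemental logging must be enabled. The command above enables table-level supplemental logging only when the table has a primary key or unique index. If the table has neither, you must specify one or more columns to act as the key when adding table-level supplemental logging; these key columns are used to filter out duplicate data.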

2.2.4 Adding the Extract process on the source

Extract process parameter file:

GGSCI (test1) 1> edit param ext_50
EXTRACT ext_50
SETENV (ORACLE_HOME="/u01/oraprod/db/tech_st/11.2.0")
SETENV (ORACLE_SID="prod")
SETENV (NLS_LANG="AMERICAN_AMERICA.AL32UTF8")
USERID ogg,PASSWORD ogg
TRANLOGOPTIONS MINEFROMACTIVEDG
LOGALLSUPCOLS
EXTTRAIL ./dirdat/50
TABLE CUX.CUX_TEMP_FOR_OGG_KEY;
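When more tables are added later, the same parameter file has to be repeated with a longer TABLE list. The sketch below renders a file with the same parameters as the one above from a list of tables. It is only an illustration: render_extract_params and its arguments are hypothetical names, and the one rule it enforces is the OGG limit of eight characters for a process group name.

```python
MAX_GROUP_NAME_LEN = 8  # OGG process group names may be at most eight characters


def render_extract_params(group, trail, tables, oracle_home, oracle_sid,
                          userid="ogg", password="ogg"):
    """Return the text of an Extract parameter file for the given tables."""
    if len(group) > MAX_GROUP_NAME_LEN:
        raise ValueError(
            f"group name {group!r} exceeds {MAX_GROUP_NAME_LEN} characters")
    if not tables:
        raise ValueError("at least one table is required")
    lines = [
        f"EXTRACT {group}",
        f'SETENV (ORACLE_HOME="{oracle_home}")',
        f'SETENV (ORACLE_SID="{oracle_sid}")',
        'SETENV (NLS_LANG="AMERICAN_AMERICA.AL32UTF8")',
        f"USERID {userid},PASSWORD {password}",
        "TRANLOGOPTIONS MINEFROMACTIVEDG",
        "LOGALLSUPCOLS",
        f"EXTTRAIL {trail}",
    ]
    # One TABLE statement per replicated table, each ending with a semicolon.
    lines += [f"TABLE {table};" for table in tables]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_extract_params(
        group="ext_50",
        trail="./dirdat/50",
        tables=["CUX.CUX_TEMP_FOR_OGG_KEY"],
        oracle_home="/u01/oraprod/db/tech_st/11.2.0",
        oracle_sid="prod",
    ))
```

The output can be written to dirprm/ext_50.prm instead of typing it through edit param; the values passed in the example are the ones used in this walkthrough.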

Add the Extract process and have it start capturing from the current time:

GGSCI (test1) 2> add extract ext_50 tranlog, begin now
EXTRACT added.

Specify the trail path and name, bind the trail to the Extract process, and set the trail file size to 1024 MB:

add exttrail ./dirdat/50, extract ext_50, megabytes 1024

2.2.5 Adding the Data Pump process on the source

Edit the parameter file:

GGSCI (test1 as goldengate@test1) 60> edit param pu_50
extract pu_50
passthru
dynamicresolution
userid ogg,password ogg
rmthost 192.168.158.152 mgrport 7809
RMTTRAIL ./dirdat/50
TABLE CUX.CUX_TEMP_FOR_OGG_KEY;

Add the Data Pump process and point it at the local trail it reads from:

GGSCI (test1 as goldengate@test1) 61> add extract pu_50, exttrailsource ./dirdat/50

Add the remote trail, specifying its location on the target, its name, and its size:

GGSCI (test1 as goldengate@test1) 62> add RMTTRAIL ./dirdat/50, extract pu_50, megabytes 1024

2.2.6 Configuring the definitions file on the source

GGSCI (ambari.master.com) 3> edit param def_50
defsfile ./dirdef/def_50
userid ogg,password ogg
TABLE CUX.CUX_TEMP_FOR_OGG_KEY;

Run the following from the OGG home directory (as the oraprod user):

./defgen paramfile dirprm/def_50.prm

Copy the generated /ogg/dirdef/def_50 file to the dirdef directory under the OGG installation on the target:

scp -r /ogg/dirdef/def_50 root@192.168.158.152:/home/ogg/dirdef/

2.2.7 Initial-load process on the source

GGSCI> add extract exi_50, sourceistable
Edit the parameters:
GGSCI> EDIT PARAMS exi_50
EXTRACT exi_50 
USERID ogg,PASSWORD ogg
RMTHOST 192.168.158.152, MGRPORT 7809
RMTFILE ./dirdat/f0,maxfiles 999, megabytes 500
TABLE CUX.CUX_TEMP_FOR_OGG_KEY;

2.2.8 Configuring kafka.props on the target

cd /home/ogg/dirprm/
vi kafka_50.props

gg.handlerlist=kafkahandler
gg.handler.kafkahandler.type=kafka
gg.handler.kafkahandler.KafkaProducerConfigFile=custom_kafka_producer.properties
gg.handler.kafkahandler.topicMappingTemplate=CUX_TEMP_FOR_OGG_KEY
gg.handler.kafkahandler.format=json
gg.handler.kafkahandler.mode=op
## round-robin
gg.handler.kafkahandler.ProducerRecordClass=oracle.goldengate.handler.kafka.MyCreateProduceRecord
gg.classpath=dirprm/:/opt/kafka_2.11-0.11.0.3/libs/*:/home/ogg/:/home/ogg/lib/*
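With format=json and mode=op, each source operation reaches the topic as one JSON document. The sketch below shows how a downstream consumer might replay such records into an in-memory table. It rests on several assumptions: the field names table, op_type, before and after and the op_type codes I, U and D are the JSON formatter defaults, the column names ID and NAME are hypothetical (the real columns of CUX.CUX_TEMP_FOR_OGG_KEY are not shown in this article), and updates that change the key column are not handled.

```python
import json


def apply_op(state, raw, key_column):
    """Apply one OGG JSON operation record to `state` (a dict keyed by row key).

    `raw` may be str or bytes; the producer config in this article uses a
    byte-array serializer, so a consumer would normally receive bytes.
    """
    record = json.loads(raw)
    op_type = record["op_type"]
    if op_type in ("I", "U"):
        after = record["after"]
        # Merge instead of replace, in case an update carries only some columns.
        state.setdefault(after[key_column], {}).update(after)
    elif op_type == "D":
        state.pop(record["before"][key_column], None)
    # Any other op_type (for example truncate) is ignored in this sketch.
    return state


if __name__ == "__main__":
    # Hypothetical messages; column names ID and NAME are made up.
    messages = [
        b'{"table":"CUX.CUX_TEMP_FOR_OGG_KEY","op_type":"I","after":{"ID":1,"NAME":"a"}}',
        b'{"table":"CUX.CUX_TEMP_FOR_OGG_KEY","op_type":"U",'
        b'"before":{"ID":1,"NAME":"a"},"after":{"ID":1,"NAME":"b"}}',
    ]
    rows = {}
    for message in messages:
        apply_op(rows, message, key_column="ID")
    print(rows)
```

Because GETUPDATEBEFORES is set on the Replicat later in this article, update records are expected to carry a before image as well; the sketch only needs it for deletes.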

## Only needed the first time
vi custom_kafka_producer.properties

bootstrap.servers=192.168.158.35:9092,192.168.158.36:9092,192.168.158.37:9092
acks=1
compression.type=gzip
reconnect.backoff.ms=1000
value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
key.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
batch.size=102400
linger.ms=10000
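The combination of batch.size=102400 and linger.ms=10000 trades latency for throughput: the producer sends a batch when it is full or when the linger time has elapsed, whichever comes first. The sketch below is a deliberately simplified model of that rule, useful only for a rough feel of the added delay. It assumes a single partition and ignores compression and per-record overhead; the function name and the traffic figures are hypothetical.

```python
def max_batch_delay_ms(batch_size_bytes, linger_ms, msgs_per_sec, avg_msg_bytes):
    """Rough upper bound on how long a record waits before its batch is sent.

    Simplified model: one partition, uncompressed size, steady arrival rate.
    """
    if msgs_per_sec <= 0 or avg_msg_bytes <= 0:
        # No steady traffic: the batch never fills, so linger.ms decides.
        return float(linger_ms)
    fill_ms = batch_size_bytes * 1000.0 / (msgs_per_sec * avg_msg_bytes)
    return min(float(linger_ms), fill_ms)


if __name__ == "__main__":
    # Low volume: 10 records/s of 512 bytes never fill 100 KB within 10 s.
    print(max_batch_delay_ms(102400, 10000, msgs_per_sec=10, avg_msg_bytes=512))
    # High volume: 1000 records/s of 1 KB fill the batch long before linger.ms.
    print(max_batch_delay_ms(102400, 10000, msgs_per_sec=1000, avg_msg_bytes=1024))
```

Under this model a quiet table can see changes arrive in Kafka up to 10 seconds late with the settings above, so linger.ms is the value to lower if that delay matters.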

2.2.9 Initial-load process on the target

GGSCI (hadoop) 3> ADD replicat rpi_50, specialrun
REPLICAT added.

GGSCI (hadoop) 4> edit params rpi_50
Add the following configuration:
SPECIALRUN
end runtime
setenv(NLS_LANG="AMERICAN_AMERICA.AL32UTF8")
targetdb libfile libggjava.so set property=./dirprm/kafka_50.props
SOURCEDEFS ./dirdef/def_50
EXTFILE ./dirdat/f0
reportcount every 1 minutes, rate
grouptransops 10000
map CUX.CUX_TEMP_FOR_OGG_KEY,target CUX.CUX_TEMP_FOR_OGG_KEY;

2.2.10 Configuring the Replicat process on the target

GGSCI (ambari.slave1.com) 4> edit param rp_50
REPLICAT rp_50
sourcedefs ./dirdef/def_50
TARGETDB LIBFILE libggjava.so SET property=./dirprm/kafka_50.props
REPORTCOUNT EVERY 1 MINUTES, RATE 
GROUPTRANSOPS 10000
GETUPDATEBEFORES
GETUPDATEAFTERS
map CUX.CUX_TEMP_FOR_OGG_KEY,target CUX.CUX_TEMP_FOR_OGG_KEY;

GGSCI (ambari.slave1.com) 2> add replicat rp_50 exttrail ./dirdat/50,checkpointtable ogg.ggschkpt

3. Data synchronization

1. Start the Extract process (source)
GGSCI> start ext_50
2. Start the Data Pump process (source)
GGSCI> start pu_50
3. Start the initial-load Extract process (source)
GGSCI> start exi_50
4. Start the initial-load Replicat process (target)
cd /home/ogg/
./replicat paramfile ./dirprm/rpi_50.prm reportfile ./dirrpt/rpi_50.rpt -p INITIALDATALOAD
5. Start the Replicat process (target)
## Set the process begin time to a point before the initial load started
GGSCI> alter replicat RP_50,begin 2023-05-29 11:30:00
GGSCI> start rp_50
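The begin time in step 5 has to be earlier than the moment the initial load started, so that no change made during the load is skipped. A minimal sketch of deriving that command from the load start time follows. The function name and the five-minute safety margin are assumptions for illustration, not values from OGG.

```python
from datetime import datetime, timedelta


def replicat_begin_command(group, initial_load_start, margin_minutes=5):
    """Build the GGSCI ALTER REPLICAT command with a begin time before the load."""
    begin = initial_load_start - timedelta(minutes=margin_minutes)
    return f"alter replicat {group}, begin {begin:%Y-%m-%d %H:%M:%S}"


if __name__ == "__main__":
    # Hypothetical load start time; prints the command to paste into GGSCI.
    print(replicat_begin_command("rp_50", datetime(2023, 5, 29, 11, 35)))
```

Starting earlier than the load means some changes may be delivered to Kafka both by the initial load and by rp_50, so consumers of the topic should be prepared to see the same row more than once.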