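A Study of OGG Initial Load Techniques
Oracle GoldenGate (OGG) is a powerful data synchronization tool that can replicate between different database types: Oracle to Oracle, Oracle to MySQL, MySQL to Oracle, SQL Server to Oracle, and so on. Before replication starts, the data on the source and target must be initialized so that the two sides stay consistent.
On one company project, I found during the initial load that the database as a whole was not large, yet the load was very slow and time-consuming. On inspection, the database turned out to contain a lot of LOB columns, an unheard-of 55 of them, which made me seriously wonder whether the design team had just graduated. That initial load wore me out. After that lesson, I looked up initial load techniques for LOB columns online and tested them, and tested OGG's other initial load techniques along the way.
Suggestions for tuning initial load performance are given at the end.
In total, I tested roughly the following methods: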
Goldengate : File To Database Utility - Initial Load Techniques
Goldengate : Direct Bulk Load to SQL*Loader - Initial Load Techniques
Goldengate : File to Replicat - Initial Load Techniques
Goldengate : Direct Load - Initial Load Techniques
Goldengate : Loading Data from Trail to Replicat
and initial load with Data Pump
Without further ado, here are the test cases:
---------------------------------------- START ----------------------------------------
Goldengate : Direct Load - Initial Load Techniques [1457164.1]
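LOB columns are not supported.
### Transformation and mapping can be done during extraction or apply, but this initial load method does not support tables with LOB, LONG, or UDT (user-defined type) columns, or any other data type larger than 4K.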
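Before picking this method, it helps to know which tables it cannot handle. The following is a minimal Python sketch, not part of OGG, that splits tables into those that can use Direct Load and those that need a file-based method. The function name and the sample column types are hypothetical, and the type list covers only LOB and LONG types (UDTs and columns larger than 4K would need an extra check).

```python
# Column types that rule a table out of Direct Load, per the restriction above.
# Assumption: only LOB and LONG types are checked here.
UNSUPPORTED_TYPES = {"CLOB", "NCLOB", "BLOB", "LONG", "LONG RAW"}


def split_tables_for_direct_load(table_columns):
    """Split tables into (eligible, excluded) based on their column types.

    table_columns maps a table name to a list of column data type names.
    """
    eligible, excluded = [], []
    for table, types in sorted(table_columns.items()):
        if any(t.upper() in UNSUPPORTED_TYPES for t in types):
            excluded.append(table)
        else:
            eligible.append(table)
    return eligible, excluded


# Hypothetical column types, for illustration only.
sample = {
    "CFPSNG.DBATAB": ["VARCHAR2", "NUMBER", "DATE"],
    "CFPSNG.T_MSG_CONTENT": ["NUMBER", "CLOB"],
}
eligible, excluded = split_tables_for_direct_load(sample)
print("direct load:", eligible)
print("use a file-based method:", excluded)
```

In practice the column types would come from the data dictionary rather than a hand-written dictionary.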
ON source:
ADD EXTRACT INITLD, SOURCEISTABLE
EDIT PARAMS INITLD
EXTRACT INITLD
USERID ogg, password oracle
RMTHOST 192.168.56.100, MGRPORT 7860
RMTTASK REPLICAT, GROUP initrep
TABLEEXCLUDE cfpsng.t_msg2;
TABLE cfpsng.*;
ON target:
ADD REPLICAT initrep, SPECIALRUN
EDIT PARAMS initrep
REPLICAT initrep
USERID ogg, password oracle
ASSUMETARGETDEFS
MAP cfpsng.*, TARGET cfpsng.*;
on source:
# Start the change-capture Extract first
START EXTRACT <extract>
# Then start the initial-load Extract
START EXTRACT <initial-load Extract>
START EXTRACT INITLD
ON target:
START REPLICAT <Replicat>
Once it is confirmed, turn off HANDLECOLLISIONS, on the target system.
SEND REPLICAT <Replicat>, NOHANDLECOLLISIONS
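Then edit the parameter file and remove HANDLECOLLISIONS.
HANDLECOLLISIONS is a Replicat parameter used mainly during an initial load. It lets Replicat keep processing trail data even when the target has potential data integrity problems, such as a missing row for an update, a missing row for a delete, or a duplicate insert.
When the HANDLECOLLISIONS parameter is set, data is processed as follows:
Missing updates are ignored.
Missing deletes are ignored.
Duplicate inserts are turned into updates.
The main use of HANDLECOLLISIONS is when Replicat is started after an initial load during which the source application stayed online and GoldenGate captured its changes. HANDLECOLLISIONS reconciles those ongoing changes with the rows already written by the load. It can also be used when you reposition Replicat backward in the trail to resolve some other problem.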
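The three rules above can be illustrated with a small Python model. This is only a sketch of the described behavior on an in-memory dictionary keyed by primary key; it is not how Replicat is implemented, and the function name and sample rows are made up for illustration.

```python
def apply_with_handlecollisions(target, operations):
    """Apply (op, key, row) operations to a dict keyed by primary key."""
    for op, key, row in operations:
        if op == "INSERT":
            # A duplicate insert is turned into an update: overwrite the row.
            target[key] = row
        elif op == "UPDATE":
            # A missing update is ignored: only touch rows that exist.
            if key in target:
                target[key] = row
        elif op == "DELETE":
            # A missing delete is ignored.
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return target


# Row 1 was already written by the initial load; rows 2 and 3 do not exist.
target = {1: "loaded-by-initial-load"}
changes = [
    ("INSERT", 1, "captured-insert"),  # duplicate insert, becomes an update
    ("UPDATE", 2, "captured-update"),  # row 2 is missing, ignored
    ("DELETE", 3, None),               # row 3 is missing, ignored
]
print(apply_with_handlecollisions(target, changes))
```

Without these rules, each of the three captured changes would raise an error on the target, which is why the parameter should be switched off again once the overlap period has been processed.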
---------------------------------------- END ----------------------------------------
---------------------------------------- START ----------------------------------------
Goldengate : File to Replicat - Initial Load Techniques [1441172.1]
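#### How to initial-load tables with LOB columns
#### This is a fairly slow initial load method. It is recommended only when the data volume is not too large and other methods cannot be used because of restrictions such as column size or data type limits.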
https://www.oracle-scn.com/oracle-goldengate-initial-load-file-to-replicat-method/
Oracle GoldenGate -- Initial Load -- File to Replicat Method
# Initialize the table structure (metadata only)
expdp \'/ as sysdba\' tables=cfpsng.dbatab CONTENT=METADATA_ONLY DUMPFILE=dbatabmeta.dmp logfile=exp20241129.log
impdp \'/ as sysdba\' tables=cfpsng.dbatab DUMPFILE=dbatabmeta.dmp logfile=imp20241129.log
# Add the process on the source
on SOURCE :
ADD EXTRACT EXTLOAD, SOURCEISTABLE
edit param EXTLOAD
EXTRACT EXTLOAD
SOURCEISTABLE
USERID ogg, password oracle
RMTHOST 192.168.56.100, MGRPORT 7860
RMTFILE /ogg/dirdat/in, PURGE
TABLE cfpsng.t_msg_content;
# Add the process on the target
on TARGET:
ADD REPLICAT REPLOAD, SPECIALRUN
edit param REPLOAD
REPLICAT REPLOAD
SPECIALRUN
USERID ogg, password oracle
EXTFILE /ogg/dirdat/in
DISCARDFILE /ogg/dirrpt/rep.dsc, PURGE
ASSUMETARGETDEFS
MAP cfpsng.t_msg_content, TARGET cfpsng.t_msg_content;
END RUNTIME
# Start the process on the source
---------------------------------------- END ----------------------------------------
---------------------------------------- START ----------------------------------------
### OGG initial load method that supports LOB columns
Oracle GoldenGate -- Initial Load -- File to Database Utility Method
on SOURCE:
ADD EXTRACT extload3, SOURCEISTABLE
dblogin USERID ogg, password oracle
delete EXTRACT extload3
edit param extload3
EXTRACT extload3
SOURCEISTABLE
USERID ogg, password oracle
RMTHOST 192.168.56.100, MGRPORT 7860
FORMATASCII, SQLLOADER
RMTFILE /ogg/dirdat/init.dat
TABLE cfpsng.t_msg_content;
on TARGET:
dblogin USERID ogg, password oracle
add checkpointtable ogg.cktab
ADD REPLICAT repload3,EXTFILE /ogg/dirdat/init.dat,checkpointtable ogg.cktab
dblogin USERID ogg, password oracle
delete REPLICAT repload3
edit param repload3
REPLICAT repload3
GENLOADFILES sqlldr.tpl
USERID ogg, password oracle
EXTFILE /ogg/dirdat/init.dat
ASSUMETARGETDEFS
MAP cfpsng.t_msg_content, TARGET cfpsng.t_msg_content;
---------------------------------------- END ----------------------------------------
---------------------------------------- START ----------------------------------------
Goldengate : File To Database Utility - Initial Load Techniques [1457989.1]
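Before starting the initial load, start the change-capture Extract on the source first. Once the initial load has finished, start the Replicat on the target.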
ON SOURCE:
1. Edit the initial-load Extract parameter file
-- eil_utl.prm
-- Initial-Load Extract: File to Database Utility for SRC.DEPT and SRC.EMP, generate a different file per table.
-- if the files already exist, overwrite them (PURGE)
-- FORMATASCII must be positioned before RMTFILE
vi /ogg/dirprm/eil_utl.prm
SOURCEISTABLE
USERID ogg, password oracle
RMTHOST 192.168.56.100, MGRPORT 7860
FORMATASCII, SQLLOADER
RMTFILE ./DBATAB.dat, PURGE
TABLE CFPSNG.DBATAB;
RMTFILE ./T_MSG_CONTENT.dat, PURGE
TABLE CFPSNG.T_MSG_CONTENT;
cd /ogg
./extract paramfile ./dirprm/eil_utl.prm reportfile ./dirrpt/eil_utl.rpt
Check the eil_utl.rpt file:
[oracle@oradb ogg]$ ls -l ./dirrpt/eil_utl.rpt
-rw-rw-rw- 1 oracle oinstall 4852 Dec 2 23:54 ./dirrpt/eil_utl.rpt
View the file contents:
***********************************************************************
* ** Run Time Statistics ** *
***********************************************************************
Report at 2024-12-02 23:54:43 (activity since 2024-12-02 23:54:32)
Output to ./DBATAB.dat:
From Table CFPSNG.DBATAB:
# inserts: 2809
# updates: 0
# deletes: 0
# discards: 0
Output to ./T_MSG_CONTENT.dat:
From Table CFPSNG.T_MSG_CONTENT:
# inserts: 10000
# updates: 0
# deletes: 0
# discards: 0
REDO Log Statistics
Bytes parsed 0
Bytes output 45648668
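To avoid reading these numbers off by eye for every table, the per-table insert counts can be pulled out of the report text. Below is a minimal Python sketch; the function name and regular expressions are my own, and they assume the report keeps the `From Table ...:` and `# inserts: N` line shapes shown above.

```python
import re

TABLE_RE = re.compile(r"^\s*From Table ([\w.$#]+)")
INSERTS_RE = re.compile(r"^\s*#\s*inserts:\s*(\d+)")


def parse_insert_counts(report_text):
    """Return {table_name: insert_count} from a Run Time Statistics section."""
    counts = {}
    current_table = None
    for line in report_text.splitlines():
        table_match = TABLE_RE.match(line)
        if table_match:
            current_table = table_match.group(1)
            continue
        inserts_match = INSERTS_RE.match(line)
        if inserts_match and current_table is not None:
            counts[current_table] = int(inserts_match.group(1))
    return counts


# Excerpt of the eil_utl.rpt statistics shown above.
SAMPLE_REPORT = """\
Output to ./DBATAB.dat:
From Table CFPSNG.DBATAB:
# inserts: 2809
# updates: 0
Output to ./T_MSG_CONTENT.dat:
From Table CFPSNG.T_MSG_CONTENT:
# inserts: 10000
# updates: 0
"""
print(parse_insert_counts(SAMPLE_REPORT))
```

These counts can later be compared with what SQL*Loader reports on the target.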
#ON TARGET
-- ril_utl.prm
-- GG Initial-Load Replicat from File to Db Utility
vi /ogg/dirprm/ril_utl.prm
GENLOADFILES sqlldr.tpl
USERID ogg, password oracle
EXTFILE ./DBATAB.dat, PURGE
EXTFILE ./T_MSG_CONTENT.dat, PURGE
ASSUMETARGETDEFS
MAP CFPSNG.*, TARGET CFPSNG.*;
./replicat paramfile ./dirprm/ril_utl.prm reportfile ./dirrpt/ril_utl.rpt
Check the file:
[oracle@dmdb2 ogg]$ ls -l ./dirrpt/ril_utl.rpt
-rw-rw-rw- 1 oracle oinstall 3034 Dec 3 00:01 ./dirrpt/ril_utl.rpt
After it finishes, a .run script and a .ctl control file are generated in the /ogg directory for each table.
[oracle@dmdb2 ogg]$ ls -l DBATAB*
-rw-rw-rw- 1 oracle oinstall 5774 Dec 3 00:01 DBATAB.ctl
-rw-rw-rw- 1 oracle oinstall 4078668 Dec 2 23:54 DBATAB.dat
-rwxrw-rw- 1 oracle oinstall 63 Dec 3 00:01 DBATAB.run
[oracle@dmdb2 ogg]$ ls -l T_MSG_CONTENT.*
-rw-rw-rw- 1 oracle oinstall 497 Dec 3 00:01 T_MSG_CONTENT.ctl
-rw-rw-rw- 1 oracle oinstall 41570000 Dec 2 23:54 T_MSG_CONTENT.dat
-rwxrw-rw- 1 oracle oinstall 77 Dec 3 00:01 T_MSG_CONTENT.run
Next, run the .run files to load the data:
[oracle@dmdb2 ogg]$ ./DBATAB.run
SQL*Loader: Release 11.2.0.4.0 - Production on Tue Dec 3 00:06:48 2024
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
SQL*Loader-941: Error during describe of table DBATAB
ORA-04043: object DBATAB does not exist
(The first run failed with ORA-04043 because SQL*Loader could not find the DBATAB table on the target. The rerun below succeeded once the table was available.)
[oracle@dmdb2 ogg]$ ./DBATAB.run
SQL*Loader: Release 11.2.0.4.0 - Production on Tue Dec 3 00:08:12 2024
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Load completed - logical record count 2809.
[oracle@dmdb2 ogg]$ ./T_MSG_CONTENT.run
SQL*Loader: Release 11.2.0.4.0 - Production on Tue Dec 3 00:08:17 2024
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Load completed - logical record count 10000.
Finally, start the Replicat on the target to begin incremental synchronization.
---------------------------------------- END ----------------------------------------
---------------------------------------- START ----------------------------------------
Goldengate : File To Database Utility - Initial Load Techniques [1457989.1]
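Supports initial load of LOB columns
# Test using wildcards
Before starting the initial load, start the change-capture Extract on the source first. Once the initial load has finished, start the Replicat on the target.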
ON SOURCE:
1. Edit the initial-load Extract parameter file
-- eil_utl2.prm
-- Initial-Load Extract: File to Database Utility for SRC.DEPT and SRC.EMP, generate a different file per table.
-- if the files already exist, overwrite them (PURGE)
-- FORMATASCII must be positioned before RMTFILE
vi /ogg/dirprm/eil_utl2.prm
SOURCEISTABLE
USERID ogg, password oracle
RMTHOST 192.168.56.100, MGRPORT 7860
FORMATASCII, SQLLOADER
RMTFILE ./DBATAB.dat, PURGE
RMTFILE ./T_MSG_CONTENT.dat, PURGE
TABLEEXCLUDE CFPSNG.T_MSG2;
TABLE CFPSNG.*;
Note:
-- TABLEEXCLUDE must be placed before TABLE; otherwise the excluded table is exported as well, as the log below shows
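Since this ordering mistake is easy to make, a parameter file can be checked before it is run. The Python sketch below is a hypothetical lint helper, not an OGG tool; it encodes only the ordering rule observed in this test and treats lines starting with `--` as comments.

```python
def misplaced_tableexcludes(param_text):
    """Return TABLEEXCLUDE lines that appear after the first TABLE statement."""
    seen_table = False
    misplaced = []
    for raw in param_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("--"):
            continue  # skip blank lines and comments
        keyword = line.split()[0].upper()
        if keyword == "TABLE":
            seen_table = True
        elif keyword == "TABLEEXCLUDE" and seen_table:
            misplaced.append(line)
    return misplaced


good = "SOURCEISTABLE\nTABLEEXCLUDE CFPSNG.T_MSG2;\nTABLE CFPSNG.*;\n"
bad = "SOURCEISTABLE\nTABLE CFPSNG.*;\nTABLEEXCLUDE CFPSNG.T_MSG2;\n"
print(misplaced_tableexcludes(good))
print(misplaced_tableexcludes(bad))
```

An empty result means every TABLEEXCLUDE comes before the first TABLE statement.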
cd /ogg
./extract paramfile ./dirprm/eil_utl2.prm reportfile ./dirrpt/eil_utl2.rpt
After it finishes, check the eil_utl2.rpt report for the exported data. A CDC.DAT file also exists under /ogg on the target.
Check the eil_utl2.rpt file:
[oracle@oradb ogg]$ ls -l ./dirrpt/eil_utl2.rpt
-rw-rw-rw- 1 oracle oinstall 5420 Dec 3 00:21 ./dirrpt/eil_utl2.rpt
View the file contents:
***********************************************************************
* ** Run Time Statistics ** *
***********************************************************************
Report at 2024-12-03 00:21:22 (activity since 2024-12-03 00:21:07)
Output to ./*.dat:
From Table CFPSNG.DBATAB:
# inserts: 2809
# updates: 0
# deletes: 0
# discards: 0
From Table CFPSNG.T_MSG2:
# inserts: 10000
# updates: 0
# deletes: 0
# discards: 0
From Table CFPSNG.T_MSG_CONTENT:
# inserts: 10000
# updates: 0
# deletes: 0
# discards: 0
REDO Log Statistics
Bytes parsed 0
Bytes output 87218668
***********************************************************************
* ** Run Time Statistics ** *
***********************************************************************
Report at 2024-12-03 00:27:49 (activity since 2024-12-03 00:27:35)
Output to ./CDC.dat:
From Table CFPSNG.T_MSG2:
# inserts: 10000
# updates: 0
# deletes: 0
# discards: 0
From Table CFPSNG.T_MSG_CONTENT:
# inserts: 10000
# updates: 0
# deletes: 0
# discards: 0
REDO Log Statistics
Bytes parsed 0
Bytes output 83140000
***********************************************************************
* ** Run Time Statistics ** *
***********************************************************************
Report at 2024-12-03 00:45:30 (activity since 2024-12-03 00:45:20)
Output to ./CDC.dat:
From Table CFPSNG.DBATAB:
# inserts: 2809
# updates: 0
# deletes: 0
# discards: 0
From Table CFPSNG.T_MSG_CONTENT:
# inserts: 10000
# updates: 0
# deletes: 0
# discards: 0
REDO Log Statistics
Bytes parsed 0
Bytes output 45648668
#ON TARGET
-- ril_utl2.prm
-- GG Initial-Load Replicat from File to Db Utility
vi /ogg/dirprm/ril_utl2.prm
GENLOADFILES sqlldr.tpl
USERID ogg, password oracle
EXTFILE ./DBATAB.dat, PURGE
EXTFILE ./T_MSG_CONTENT.dat, PURGE
ASSUMETARGETDEFS
MAPEXCLUDE CFPSNG.T_MSG2;
MAP CFPSNG.*, TARGET CFPSNG.*;
./replicat paramfile ./dirprm/ril_utl2.prm reportfile ./dirrpt/ril_utl2.rpt
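After it finishes, the following files are generated in the directory (this step is relatively fast).
A .run script and a .ctl control file are generated in the /ogg directory for each table.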
[oracle@dmdb2 ogg]$ ls -ltr *.run
-rwxrw-rw- 1 oracle oinstall 63 Dec 3 00:47 DBATAB.run
-rwxrw-rw- 1 oracle oinstall 77 Dec 3 00:47 T_MSG_CONTENT.run
[oracle@dmdb2 ogg]$ ls -ltr *.ctl
-rw-rw-rw- 1 oracle oinstall 5774 Dec 3 00:47 DBATAB.ctl
-rw-rw-rw- 1 oracle oinstall 497 Dec 3 00:47 T_MSG_CONTENT.ctl
Check the report file:
[oracle@dmdb2 ogg]$ ls -l ./dirrpt/ril_utl2.rpt
-rw-rw-rw- 1 oracle oinstall 2332 Dec 3 00:35 ./dirrpt/ril_utl2.rpt
Next, run the .run files to load the data.
The import failed and a CDC.bad file was generated.
[oracle@dmdb2 ogg]$ ./DBATAB.run
SQL*Loader: Release 11.2.0.4.0 - Production on Tue Dec 3 00:50:39 2024
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Load completed - logical record count 3088.
[oracle@dmdb2 ogg]$ ./T_MSG_CONTENT.run
SQL*Loader: Release 11.2.0.4.0 - Production on Tue Dec 3 00:51:02 2024
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
Load completed - logical record count 722.
[oracle@dmdb2 ogg]$ ls -l CDC.bad
-rw-r--r-- 1 oracle oinstall 74052 Dec 3 00:51 CDC.bad
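Here the .dat file name inside the .ctl files needs to be changed, because a single file name, CDC.DAT, was specified for all tables earlier. The test was then repeated with a separate RMTFILE name per table.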
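A quick way to catch this kind of failure is to compare the insert counts from the Extract report with the logical record counts printed by SQL*Loader. The Python sketch below is a hypothetical helper that uses the numbers from this test run; note that a logical record count is the number of records SQL*Loader read, so it is only a rough signal.

```python
def find_count_mismatches(extracted, loaded):
    """Return {table: (extracted_count, loaded_count)} where the counts differ."""
    mismatches = {}
    for table, expected in extracted.items():
        actual = loaded.get(table)  # None if the table was never loaded
        if actual != expected:
            mismatches[table] = (expected, actual)
    return mismatches


# Insert counts from the Extract report and logical record counts from the
# failed SQL*Loader runs above.
extracted = {"CFPSNG.DBATAB": 2809, "CFPSNG.T_MSG_CONTENT": 10000}
loaded = {"CFPSNG.DBATAB": 3088, "CFPSNG.T_MSG_CONTENT": 722}

for table, (expected, actual) in find_count_mismatches(extracted, loaded).items():
    print(f"{table}: extracted {expected}, loaded {actual}")
```

Both tables are flagged here, which matches the failed import and the CDC.bad file.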
Finally, start the Replicat on the target to begin incremental synchronization.
---------------------------------------- END ----------------------------------------
---------------------------------------- START ----------------------------------------
Goldengate : Direct Bulk Load to SQL*Loader - Initial Load Techniques [1461851.1]
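## This method does not support tables with LOB or LONG data types. You can use File to Replicat or File to Database Utility as an alternative initial load method.
## Materialized views with LOBs are not supported either.
## This initial load method does not support data encryption either.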
ON source:
DBLOGIN USERID ogg, PASSWORD oracle
ADD EXTRACT EIL_BULK, SOURCEISTABLE
EDIT PARAMS eil_bulk
--
-- eil_bulk.prm
-- Initial- Load Extract : Bulkload to sqlldr
--
EXTRACT eil_bulk
SOURCEISTABLE
USERID ogg, PASSWORD oracle
RMTHOST 192.168.56.100, MGRPORT 7860
RMTTASK REPLICAT, GROUP ril_bulk
TABLEEXCLUDE cfpsng.t_msg2;
TABLE cfpsng.*;
ON target:
ADD REPLICAT RIL_BULK, SPECIALRUN
EDIT PARAMS RIL_BULK
--
-- ril_bulk.prm
-- Direct Bulk Load for schema TGT.*
--
REPLICAT ril_bulk
BULKLOAD
USERID ogg, PASSWORD oracle
ASSUMETARGETDEFS
SOURCEDEFS ./dirsql/locations.sql
MAP cfpsng.*, TARGET cfpsng.*;
On source:
START EXTRACT EIL_BULK
Note: this method does not support LOB columns. Retry with the LOB table excluded.
2024-12-03 07:50:52 ERROR OGG-01192 Trying to use RMTTASK on data types which may be written as LOB chunks (Table: 'CFPSNG.T_MSG_CONTENT').
On source:
After the run completes, check the report:
[oracle@oradb dirrpt]$ cat EIL_BULK.rpt
***********************************************************************
* ** Run Time Statistics ** *
***********************************************************************
Report at 2024-12-03 07:56:34 (activity since 2024-12-03 07:56:25)
Output to ril_bulk:
From Table CFPSNG.DBATAB:
# inserts: 2809
# updates: 0
# deletes: 0
# discards: 0
REDO Log Statistics
Bytes parsed 0
Bytes output 1973790
On target:
[oracle@dmdb2 dirrpt]$ cat RIL_BULK.rpt
***********************************************************************
* ** Run Time Statistics ** *
***********************************************************************
Report at 2024-12-03 07:56:51 (activity since 2024-12-03 07:56:45)
From Table CFPSNG.DBATAB to CFPSNG.DBATAB:
# inserts: 2809
# updates: 0
# deletes: 0
# discards: 0
---------------------------------------- END ----------------------------------------
---------------------------------------- START ----------------------------------------
Loading Data from Trail to Replicat
With this initial load method, an initial-load Extract writes the records to a series of external files, which Replicat uses to apply the changes to the target site.
This is faster than the previous method because replication can begin while extraction is still under way.
It can also process a nearly unlimited number of tables.
For detailed information and an example, refer to Document 1195705.1, How to initial load files/tables larger than 2 gig using rmtfile.
---------------------------------------- END ----------------------------------------
---------------------------------------- START ----------------------------------------
## Performance tuning for the initial load
Parallel Processing
Parallel GoldenGate processes can be used for all initial load methods except those performed with a database utility.
You can run multiple Replicat processes in parallel with multiple Extract processes to increase throughput. To differentiate among Replicat processes, assign each one a group name.
It is recommended to keep related tables, i.e. tables linked by referential integrity constraints, within the same group.
TABLE and MAP parameters can be used to specify a different set of tables for each Extract/Replicat pair (i.e. ext1/ext2 and the corresponding rep1/rep2).
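Deciding which tables must stay together can be automated once the foreign key relationships are known. The Python sketch below groups tables with a small union-find; it is only an illustration, the table names and foreign keys are hypothetical, and it assumes every table named in a foreign key is also in the table list.

```python
def group_related_tables(tables, foreign_keys):
    """Group tables connected by foreign keys.

    foreign_keys is a list of (child_table, referenced_table) pairs.
    Returns a sorted list of sorted groups.
    """
    parent = {t: t for t in tables}

    def find(t):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for child, referenced in foreign_keys:
        parent[find(child)] = find(referenced)

    groups = {}
    for t in tables:
        groups.setdefault(find(t), []).append(t)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical schema, for illustration only.
tables = ["ORDERS", "ORDER_ITEMS", "PRODUCTS", "AUDIT_LOG"]
fks = [("ORDER_ITEMS", "ORDERS"), ("ORDER_ITEMS", "PRODUCTS")]
for group in group_related_tables(tables, fks):
    print(group)
```

Each resulting group can then be assigned to one Extract/Replicat pair through its TABLE and MAP statements.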
Another technique is to use the RANGE function to split the rows into equal buckets. GoldenGate manages this by applying a hash algorithm to the key values.
For instance, to split the table across 3 parallel processes:
EXT1
TABLE HR.PRODUCTS, FILTER (@RANGE (1, 3));
EXT2
TABLE HR.PRODUCTS, FILTER (@RANGE (2, 3));
EXT3
TABLE HR.PRODUCTS, FILTER (@RANGE (3, 3));
Refer to Document 1289341.1, Guidance on the best way to use RANGE function with EXTRACT.
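The idea behind splitting rows by a hash of the key can be sketched in Python. This is an illustration only: `zlib.crc32` is a stand-in hash chosen for the example, not the algorithm GoldenGate uses internally, and the function names are hypothetical.

```python
import zlib


def range_bucket(key, total_ranges):
    """Return a 1-based bucket number for a key, similar in spirit to @RANGE."""
    # Assumption: crc32 stands in for GoldenGate's internal hash.
    return zlib.crc32(str(key).encode("utf-8")) % total_ranges + 1


def keeps_row(key, my_range, total_ranges):
    """True if the process owning my_range should handle this key."""
    return range_bucket(key, total_ranges) == my_range


# Count how many of 1000 keys each of the 3 processes would handle.
sizes = {r: 0 for r in (1, 2, 3)}
for key in range(1, 1001):
    sizes[range_bucket(key, 3)] += 1
print(sizes)
```

Because the bucket depends only on the key value, every row goes to exactly one of the three processes, and the same key always lands in the same process.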