Learning OceanBase (1): Migrating MySQL Data to OceanBase

I. Database Environment

# MySQL environment

root@192.168.150.162 20:28: [(none)]> select version();

+-----------+
| version() |
+-----------+
| 8.0.26    |
+-----------+

1 row in set (0.00 sec)

root@192.168.150.162 20:28: [(none)]> show variables like '%char%';

+--------------------------+-----------------------------------+
| Variable_name            | Value                             |
+--------------------------+-----------------------------------+
| character_set_client     | utf8mb4                           |
| character_set_connection | utf8mb4                           |
| character_set_database   | utf8mb4                           |
| character_set_filesystem | binary                            |
| character_set_results    | utf8mb4                           |
| character_set_server     | utf8mb4                           |
| character_set_system     | utf8mb3                           |
| character_sets_dir       | /usr/local/mysql8/share/charsets/ |
+--------------------------+-----------------------------------+

8 rows in set (0.00 sec)


# OceanBase environment

obclient [test]> select version();

+------------------------------+
| version()                    |
+------------------------------+
| 5.7.25-OceanBase_CE-v4.2.1.2 |
+------------------------------+

1 row in set (0.002 sec)

obclient [test]> show variables like '%chara%';

+--------------------------+---------+
| Variable_name            | Value   |
+--------------------------+---------+
| character_set_client     | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database   | utf8mb4 |
| character_set_filesystem | binary  |
| character_set_results    | utf8mb4 |
| character_set_server     | utf8mb4 |
| character_set_system     | utf8mb4 |
+--------------------------+---------+

7 rows in set (0.005 sec)

This confirms that the MySQL and OceanBase character sets match. The only difference is character_set_system (utf8mb3 vs. utf8mb4), which is server-internal and does not affect the migrated data.

II. Migrating Data to OceanBase with mysqldump

Use mysqldump on the MySQL side to export the data as SQL text, transfer the dump file to the OceanBase host, and then import it into the OceanBase database with the source command.

# Tables in the current MySQL instance

MySQL [(none)]> use test;

Reading table information for completion of table and column names

You can turn off this feature to get a quicker startup with -A

Database changed

MySQL [test]> show tables;

+-------------------+
| Tables_in_test    |
+-------------------+
| cluster_test      |
| cluster_test1     |
| cluster_test2     |
| cluster_test3     |
| cluster_test4     |
| t1                |
| t2                |
| t8                |
| t_smallint        |
| test_clustered    |
| test_nonclustered |
+-------------------+

11 rows in set (0.00 sec)

Export the data with mysqldump:

mysqldump -h 192.168.150.162 -uroot -P4000 -p --databases test > test_oceanbase.sql
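For a consistent export from a live source, two extra mysqldump flags are worth considering; a sketch of the same command with them added (the extra flags are suggestions, not part of the original walkthrough):

# --single-transaction exports a consistent InnoDB snapshot without locking tables;
# --set-gtid-purged=OFF keeps GTID statements out of the dump, which the
# OceanBase target does not need.
mysqldump -h 192.168.150.162 -uroot -P4000 -p --single-transaction --set-gtid-purged=OFF --databases test > test_oceanbase.sql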

# Transfer the dump file to the OceanBase server

scp test_oceanbase.sql 192.168.150.116:/home/admin

# Import into OceanBase

obclient [test]> source test_oceanbase.sql
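The dump can also be imported non-interactively instead of via source; a minimal sketch, assuming a direct connection on port 2881 and the root@sys user (both placeholders for your own deployment):

# obclient accepts the same connection flags as the mysql client,
# so the dump file can be redirected straight into it.
obclient -h192.168.150.116 -P2881 -uroot@sys -p -Dtest < test_oceanbase.sql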

obclient [test]> show tables;

+-------------------+
| Tables_in_test    |
+-------------------+
| cluster_test      |
| cluster_test1     |
| cluster_test2     |
| cluster_test3     |
| cluster_test4     |
| t1                |
| t2                |
| t8                |
| t_smallint        |
| test_clustered    |
| test_nonclustered |
+-------------------+

11 rows in set (0.004 sec)

# Spot-check that tables and data were imported

obclient [test]> select * from big;

+----------------------+---------------------+
| id                   | id1                 |
+----------------------+---------------------+
| 18446744073709551615 | 9223372036854775807 |
+----------------------+---------------------+
1 row in set (0.003 sec)

The row holds the maximum values of BIGINT UNSIGNED and BIGINT, so boundary values survived the migration intact.
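The spot check can be extended to a mechanical per-table row-count comparison; a rough bash sketch (the hosts are from this walkthrough, but the passwords and the root@sys tenant user are placeholders):

# Compare row counts for every table in the test database on both sides.
for t in $(mysql -h192.168.150.162 -P4000 -uroot -p'***' -N -e 'show tables from test'); do
    src=$(mysql -h192.168.150.162 -P4000 -uroot -p'***' -N -e "select count(*) from test.\`$t\`")
    dst=$(obclient -h192.168.150.116 -P2881 -uroot@sys -p'***' -N -e "select count(*) from test.\`$t\`")
    [ "$src" = "$dst" ] && status=OK || status=MISMATCH
    echo "$t: mysql=$src oceanbase=$dst $status"
done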

III. Importing Data Offline from MySQL into OceanBase with DataX

# DataX deployment and installation

DataX download URL: https://datax-opensource.oss-cn-hangzhou.aliyuncs.com/202210/datax.tar.gz

1. Download DataX directly onto the server

wget https://datax-opensource.oss-cn-hangzhou.aliyuncs.com/202210/datax.tar.gz

2. Extract DataX

[admin@localhost ~]$ tar zxvf datax.tar.gz

3. Install Java

yum install java

4. Verify that DataX is installed correctly

job.json is the self-test job that ships with DataX: it streams generated records from a streamreader to a streamwriter.

[admin@localhost bin]$ python datax.py .../job/job.json

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright © 2010-2017, Alibaba Group. All Rights Reserved.

2023-12-21 23:22:50.245 [main] INFO MessageSource - JVM TimeZone: GMT+08:00, Locale: zh_CN
2023-12-21 23:22:50.248 [main] INFO MessageSource - use Locale: zh_CN timeZone: sun.util.calendar.ZoneInfo[id="GMT+08:00",offset=28800000,dstSavings=0,useDaylight=false,transitions=0,lastRule=null]

2023-12-21 23:22:50.307 [main] INFO VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl

2023-12-21 23:22:50.312 [main] INFO Engine - the machine info =>

osInfo: Red Hat, Inc. 1.8 25.392-b08

jvmInfo: Linux amd64 3.10.0-1160.el7.x86_64

cpu num: 8

totalPhysicalMemory: -0.00G

freePhysicalMemory: -0.00G

maxFileDescriptorCount: -1

currentOpenFileDescriptorCount: -1

GC Names [PS MarkSweep, PS Scavenge]

MEMORY_NAME            | allocation_size | init_size
PS Eden Space          | 256.00MB        | 256.00MB
Code Cache             | 240.00MB        | 2.44MB
Compressed Class Space | 1,024.00MB      | 0.00MB
PS Survivor Space      | 42.50MB         | 42.50MB
PS Old Gen             | 683.00MB        | 683.00MB
Metaspace              | -0.00MB         | 0.00MB

2023-12-21 23:22:50.329 [main] INFO Engine -

{
    "content": [
        {
            "reader": {
                "name": "streamreader",
                "parameter": {
                    "column": [
                        {
                            "type": "string",
                            "value": "DataX"
                        },
                        {
                            "type": "long",
                            "value": 19890604
                        },
                        {
                            "type": "date",
                            "value": "1989-06-04 00:00:00"
                        },
                        {
                            "type": "bool",
                            "value": true
                        },
                        {
                            "type": "bytes",
                            "value": "test"
                        }
                    ],
                    "sliceRecordCount": 100000
                }
            },
            "writer": {
                "name": "streamwriter",
                "parameter": {
                    "encoding": "UTF-8",
                    "print": false
                }
            }
        }
    ],
    "setting": {
        "errorLimit": {
            "percentage": 0.02,
            "record": 0
        },
        "speed": {
            "channel": 1
        }
    }
}

2023-12-21 23:22:50.348 [main] WARN Engine - prioriy set to 0, because NumberFormatException, the value is: null

2023-12-21 23:22:50.361 [main] INFO PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0

2023-12-21 23:22:50.361 [main] INFO JobContainer - DataX jobContainer starts job.

2023-12-21 23:22:50.365 [main] INFO JobContainer - Set jobId = 0

2023-12-21 23:22:50.402 [job-0] INFO JobContainer - jobContainer starts to do prepare ...

2023-12-21 23:22:50.402 [job-0] INFO JobContainer - DataX Reader.Job [streamreader] do prepare work .

2023-12-21 23:22:50.402 [job-0] INFO JobContainer - DataX Writer.Job [streamwriter] do prepare work .

2023-12-21 23:22:50.402 [job-0] INFO JobContainer - jobContainer starts to do split ...

2023-12-21 23:22:50.403 [job-0] INFO JobContainer - Job set Channel-Number to 1 channels.

2023-12-21 23:22:50.403 [job-0] INFO JobContainer - DataX Reader.Job [streamreader] splits to [1] tasks.

2023-12-21 23:22:50.403 [job-0] INFO JobContainer - DataX Writer.Job [streamwriter] splits to [1] tasks.

2023-12-21 23:22:50.421 [job-0] INFO JobContainer - jobContainer starts to do schedule ...

2023-12-21 23:22:50.426 [job-0] INFO JobContainer - Scheduler starts [1] taskGroups.

2023-12-21 23:22:50.429 [job-0] INFO JobContainer - Running by standalone Mode.

2023-12-21 23:22:50.438 [taskGroup-0] INFO TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.

2023-12-21 23:22:50.442 [taskGroup-0] INFO Channel - Channel set byte_speed_limit to -1, No bps activated.

2023-12-21 23:22:50.442 [taskGroup-0] INFO Channel - Channel set record_speed_limit to -1, No tps activated.

2023-12-21 23:22:50.468 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started

2023-12-21 23:22:50.789 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[335]ms

2023-12-21 23:22:50.790 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] completed it's tasks.

2023-12-21 23:23:00.452 [job-0] INFO StandAloneJobContainerCommunicator - Total 100000 records, 2600000 bytes | Speed 253.91KB/s, 10000 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.046s | All Task WaitReaderTime 0.056s | Percentage 100.00%

2023-12-21 23:23:00.453 [job-0] INFO AbstractScheduler - Scheduler accomplished all tasks.

2023-12-21 23:23:00.453 [job-0] INFO JobContainer - DataX Writer.Job [streamwriter] do post work.

2023-12-21 23:23:00.453 [job-0] INFO JobContainer - DataX Reader.Job [streamreader] do post work.

2023-12-21 23:23:00.453 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.

2023-12-21 23:23:00.454 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /home/admin/datax/hook

2023-12-21 23:23:00.455 [job-0] INFO JobContainer -

[total cpu info] =>
averageCpu | maxDeltaCpu | minDeltaCpu
-1.00%     | -1.00%      | -1.00%

[total gc info] =>
NAME         | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
PS MarkSweep | 0            | 0               | 0               | 0.000s      | 0.000s         | 0.000s
PS Scavenge  | 0            | 0               | 0               | 0.000s      | 0.000s         | 0.000s

2023-12-21 23:23:00.455 [job-0] INFO JobContainer - PerfTrace not enable!

2023-12-21 23:23:00.456 [job-0] INFO StandAloneJobContainerCommunicator - Total 100000 records, 2600000 bytes | Speed 253.91KB/s, 10000 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.046s | All Task WaitReaderTime 0.056s | Percentage 100.00%

2023-12-21 23:23:00.457 [job-0] INFO JobContainer -
Job start time            : 2023-12-21 23:22:50
Job end time              : 2023-12-21 23:23:00
Total elapsed time        : 10s
Average traffic           : 253.91KB/s
Record write speed        : 10000rec/s
Total records read        : 100000
Total read/write failures : 0

5. Create the DataX job JSON

{
    "job": {
        "entry": {
            "jvm": "-Xms1024m -Xmx1024m"
        },
        "setting": {
            "speed": {
                "channel": 4
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.1
            }
        },
        "content": [{
            "reader": {
                "name": "mysqlreader",
                "parameter": {
                    "username": "root",
                    "password": "oracle123",
                    "column": ["*"],
                    "connection": [{
                        "table": ["Tab_A"],
                        "jdbcUrl": ["jdbc:mysql://192.168.150.162:4000/test?useUnicode=true&characterEncoding=utf8&useSSL=false"]
                    }]
                }
            },
            "writer": {
                "name": "oceanbasev10writer",
                "parameter": {
                    "obWriteMode": "insert",
                    "column": ["*"],
                    "preSql": ["truncate table Tab_A"],
                    "connection": [{
                        "jdbcUrl": "||_dsc_ob10_dsc_||obdemo:obmysql||_dsc_ob10_dsc_||jdbc:oceanbase://192.168.150.116:2883/test?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true",
                        "table": ["Tab_A"]
                    }],
                    "username": "root",
                    "password": "oracle123",
                    "writerThreadCount": 10,
                    "batchSize": 1000,
                    "memstoreThreshold": "0.9"
                }
            }
        }]
    }
}

In the writer's jdbcUrl, the ||_dsc_ob10_dsc_||obdemo:obmysql||_dsc_ob10_dsc_|| prefix is how oceanbasev10writer encodes the target cluster and tenant (cluster obdemo, tenant obmysql) in front of the plain JDBC URL.

6. Run the offline data sync

Source data:

MySQL [test]> select * from Tab_A;
+----+------+------+------+------+------+--------+
| id | bid  | cid  | name | type | num  | amt    |
+----+------+------+------+------+------+--------+
|  1 |    1 |    1 | A01  | 01   |  111 | 111.00 |
|  2 |    2 |    2 | A01  | 01   |  112 | 111.00 |
|  3 |    3 |    3 | A02  | 02   |  113 | 111.00 |
|  4 |    4 |    4 | A02  | 02   |  112 | 111.00 |
|  5 |    5 |    5 | A01  | 01   |  111 | 111.00 |
|  6 |    6 |    6 | A02  | 02   |  113 | 111.00 |
|  7 |    5 |    7 | A01  | 01   |  111 |  88.00 |
|  8 |    6 |    8 | A02  | 02   |  113 |  88.00 |
+----+------+------+------+------+------+--------+
8 rows in set (0.26 sec)

Target data:

obclient [test]> select * from Tab_A;
Empty set (0.133 sec)

Run the sync:

python ./datax.py .../job/mysql2ob.json

2023-12-22 00:42:13.745 [job-0] INFO JobContainer - PerfTrace not enable!
2023-12-22 00:42:13.745 [job-0] INFO StandAloneJobContainerCommunicator - Total 8 records, 134 bytes | Speed 13B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.020s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-12-22 00:42:13.747 [job-0] INFO JobContainer -
Job start time            : 2023-12-22 00:41:54
Job end time              : 2023-12-22 00:42:13
Total elapsed time        : 19s
Average traffic           : 13B/s
Record write speed        : 0rec/s
Total records read        : 8
Total read/write failures : 0

7. Check the data:

obclient [test]> select * from Tab_A;
+----+------+------+------+------+------+--------+
| id | bid  | cid  | name | type | num  | amt    |
+----+------+------+------+------+------+--------+
|  1 |    1 |    1 | A01  | 01   |  111 | 111.00 |
|  2 |    2 |    2 | A01  | 01   |  112 | 111.00 |
|  3 |    3 |    3 | A02  | 02   |  113 | 111.00 |
|  4 |    4 |    4 | A02  | 02   |  112 | 111.00 |
|  5 |    5 |    5 | A01  | 01   |  111 | 111.00 |
|  6 |    6 |    6 | A02  | 02   |  113 | 111.00 |
|  7 |    5 |    7 | A01  | 01   |  111 |  88.00 |
|  8 |    6 |    8 | A02  | 02   |  113 |  88.00 |
+----+------+------+------+------+------+--------+
8 rows in set (0.002 sec)
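For repeated runs, the sync and the post-check can be wrapped in a small script; a sketch, assuming datax.py returns a non-zero exit code on failure (paths, the password, and the root@sys user are placeholders):

# Run the DataX job, then re-check the target row count only if it succeeded.
cd /home/admin/datax/bin || exit 1
if python datax.py ../job/mysql2ob.json; then
    # Expect 8 rows in test.Tab_A after a clean sync.
    obclient -h192.168.150.116 -P2881 -uroot@sys -p'***' -N -e 'select count(*) from test.Tab_A'
else
    echo "datax job failed" >&2
    exit 1
fi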
