Learning OceanBase, Part 1: Migrating MySQL Data to OceanBase

1. Database Environments

# MySQL environment

root@192.168.150.162 20:28: [(none)]> select version();
+-----------+
| version() |
+-----------+
| 8.0.26    |
+-----------+
1 row in set (0.00 sec)

root@192.168.150.162 20:28: [(none)]> show variables like '%char%';
+--------------------------+-----------------------------------+
| Variable_name            | Value                             |
+--------------------------+-----------------------------------+
| character_set_client     | utf8mb4                           |
| character_set_connection | utf8mb4                           |
| character_set_database   | utf8mb4                           |
| character_set_filesystem | binary                            |
| character_set_results    | utf8mb4                           |
| character_set_server     | utf8mb4                           |
| character_set_system     | utf8mb3                           |
| character_sets_dir       | /usr/local/mysql8/share/charsets/ |
+--------------------------+-----------------------------------+
8 rows in set (0.00 sec)


# OceanBase environment

obclient [test]> select version();
+------------------------------+
| version()                    |
+------------------------------+
| 5.7.25-OceanBase_CE-v4.2.1.2 |
+------------------------------+
1 row in set (0.002 sec)

obclient [test]> show variables like '%chara%';
+--------------------------+---------+
| Variable_name            | Value   |
+--------------------------+---------+
| character_set_client     | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database   | utf8mb4 |
| character_set_filesystem | binary  |
| character_set_results    | utf8mb4 |
| character_set_server     | utf8mb4 |
| character_set_system     | utf8mb4 |
+--------------------------+---------+
7 rows in set (0.005 sec)

The character sets on the two sides match: client, connection, database, results, and server are all utf8mb4 on both MySQL and OceanBase (only the internal character_set_system differs, utf8mb3 versus utf8mb4).
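This comparison can also be scripted rather than eyeballed. A minimal sketch, assuming obclient accepts the same -N/-e batch flags as the stock mysql client, that plain root logins work from this host, and with '***' as a placeholder for the real passwords:

# Dump the character_set% variables from both sides and diff them;
# only character_set_system and the MySQL-only character_sets_dir
# line should show up as differences.
mysql    -h 192.168.150.162 -P 4000 -uroot -p'***' -N -e "show variables like 'character%'" | sort > /tmp/charset_mysql.txt
obclient -h 192.168.150.116 -P 2883 -uroot -p'***' -N -e "show variables like 'character%'" | sort > /tmp/charset_ob.txt
diff /tmp/charset_mysql.txt /tmp/charset_ob.txt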

2. Migrating Data to OceanBase with mysqldump

Use mysqldump on the MySQL side to export the data as an SQL text file, transfer the dump file to the OceanBase host, and then import it into OceanBase with the source command. The step-by-step commands follow below.
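Alternatively, when the two hosts can reach each other directly, the intermediate file and the scp hop can be skipped by piping the dump straight into obclient. A minimal sketch using the hosts and ports that appear in this post (the 2883 obproxy port is taken from the DataX JDBC URL later on; '***' stands for the real passwords, and the obclient user may need the user@tenant form in your deployment):

# Stream the dump over the network instead of staging a file.
mysqldump -h 192.168.150.162 -P 4000 -uroot -p'***' --databases test \
  | obclient -h 192.168.150.116 -P 2883 -uroot -p'***' -D test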

# Tables currently in the source MySQL database

MySQL [(none)]> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MySQL [test]> show tables;
+-------------------+
| Tables_in_test    |
+-------------------+
| cluster_test      |
| cluster_test1     |
| cluster_test2     |
| cluster_test3     |
| cluster_test4     |
| t1                |
| t2                |
| t8                |
| t_smallint        |
| test_clustered    |
| test_nonclustered |
+-------------------+
11 rows in set (0.00 sec)

Export the data with mysqldump:

mysqldump -h 192.168.150.162 -uroot -P4000 -p --databases test > test_oceanbase.sql
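For a busier source instance, a few additional standard mysqldump options are worth considering; a hedged variant of the same export:

# --single-transaction: consistent InnoDB snapshot without locking tables
# --set-gtid-purged=OFF: omit GTID statements the target does not need
# --hex-blob: dump binary columns as hex so the file survives transfers
mysqldump -h 192.168.150.162 -uroot -P4000 -p --single-transaction \
  --set-gtid-purged=OFF --hex-blob --default-character-set=utf8mb4 \
  --databases test > test_oceanbase.sql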

# Transfer the dump file to the OceanBase server

scp test_oceanbase.sql 192.168.150.116:/home/admin

# Import into OceanBase

obclient [test]> source test_oceanbase.sql

obclient [test]> show tables;
+-------------------+
| Tables_in_test    |
+-------------------+
| cluster_test      |
| cluster_test1     |
| cluster_test2     |
| cluster_test3     |
| cluster_test4     |
| t1                |
| t2                |
| t8                |
| t_smallint        |
| test_clustered    |
| test_nonclustered |
+-------------------+
11 rows in set (0.004 sec)

# Spot-check that the tables and data were imported

obclient [test]> select * from big;
+----------------------+---------------------+
| id                   | id1                 |
+----------------------+---------------------+
| 18446744073709551615 | 9223372036854775807 |
+----------------------+---------------------+
1 row in set (0.003 sec)
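A spot check only covers one table; row counts for every table can be compared mechanically. A minimal sketch under the same connection assumptions as above (placeholder passwords, plain root logins):

# Count rows for every table in test on both sides and print them side by side.
for t in $(mysql -h 192.168.150.162 -P 4000 -uroot -p'***' -N -e \
    "select table_name from information_schema.tables where table_schema='test'"); do
  src=$(mysql    -h 192.168.150.162 -P 4000 -uroot -p'***' -N -e "select count(*) from test.\`$t\`")
  dst=$(obclient -h 192.168.150.116 -P 2883 -uroot -p'***' -N -e "select count(*) from test.\`$t\`")
  echo "$t: source=$src target=$dst"
done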

3. Offline Data Import from MySQL to OceanBase with DataX

# DataX deployment and installation

DataX download URL: https://datax-opensource.oss-cn-hangzhou.aliyuncs.com/202210/datax.tar.gz

1. Download DataX directly on the server

wget https://datax-opensource.oss-cn-hangzhou.aliyuncs.com/202210/datax.tar.gz

2. Extract DataX

[admin@localhost ~]$ tar zxvf datax.tar.gz

3. Install Java

yum install java
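DataX runs on the JVM (the self-test log below reports JDK 1.8), so it is worth confirming that a Java runtime is on the PATH before continuing:

# Expect a 1.8.x version string, matching the "1.8 25.392-b08" in the log below.
java -version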

4. Test that DataX was installed successfully

[admin@localhost bin]$ python datax.py ../job/job.json

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright © 2010-2017, Alibaba Group. All Rights Reserved.

2023-12-21 23:22:50.245 [main] INFO MessageSource - JVM TimeZone: GMT+08:00, Locale: zh_CN
2023-12-21 23:22:50.248 [main] INFO MessageSource - use Locale: zh_CN timeZone: sun.util.calendar.ZoneInfo[id="GMT+08:00",offset=28800000,dstSavings=0,useDaylight=false,transitions=0,lastRule=null]
2023-12-21 23:22:50.307 [main] INFO VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2023-12-21 23:22:50.312 [main] INFO Engine - the machine info =>

    osInfo:  Red Hat, Inc. 1.8 25.392-b08
    jvmInfo: Linux amd64 3.10.0-1160.el7.x86_64
    cpu num: 8

    totalPhysicalMemory:    -0.00G
    freePhysicalMemory:     -0.00G
    maxFileDescriptorCount: -1
    currentOpenFileDescriptorCount: -1

    GC Names [PS MarkSweep, PS Scavenge]

    MEMORY_NAME            | allocation_size | init_size
    PS Eden Space          | 256.00MB        | 256.00MB
    Code Cache             | 240.00MB        | 2.44MB
    Compressed Class Space | 1,024.00MB      | 0.00MB
    PS Survivor Space      | 42.50MB         | 42.50MB
    PS Old Gen             | 683.00MB        | 683.00MB
    Metaspace              | -0.00MB         | 0.00MB

2023-12-21 23:22:50.329 [main] INFO Engine -
{
    "content":[
        {
            "reader":{
                "name":"streamreader",
                "parameter":{
                    "column":[
                        {
                            "type":"string",
                            "value":"DataX"
                        },
                        {
                            "type":"long",
                            "value":19890604
                        },
                        {
                            "type":"date",
                            "value":"1989-06-04 00:00:00"
                        },
                        {
                            "type":"bool",
                            "value":true
                        },
                        {
                            "type":"bytes",
                            "value":"test"
                        }
                    ],
                    "sliceRecordCount":100000
                }
            },
            "writer":{
                "name":"streamwriter",
                "parameter":{
                    "encoding":"UTF-8",
                    "print":false
                }
            }
        }
    ],
    "setting":{
        "errorLimit":{
            "percentage":0.02,
            "record":0
        },
        "speed":{
            "channel":1
        }
    }
}

2023-12-21 23:22:50.348 [main] WARN Engine - prioriy set to 0, because NumberFormatException, the value is: null
2023-12-21 23:22:50.361 [main] INFO PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
2023-12-21 23:22:50.361 [main] INFO JobContainer - DataX jobContainer starts job.
2023-12-21 23:22:50.365 [main] INFO JobContainer - Set jobId = 0
2023-12-21 23:22:50.402 [job-0] INFO JobContainer - jobContainer starts to do prepare ...
2023-12-21 23:22:50.402 [job-0] INFO JobContainer - DataX Reader.Job [streamreader] do prepare work .
2023-12-21 23:22:50.402 [job-0] INFO JobContainer - DataX Writer.Job [streamwriter] do prepare work .
2023-12-21 23:22:50.402 [job-0] INFO JobContainer - jobContainer starts to do split ...
2023-12-21 23:22:50.403 [job-0] INFO JobContainer - Job set Channel-Number to 1 channels.
2023-12-21 23:22:50.403 [job-0] INFO JobContainer - DataX Reader.Job [streamreader] splits to [1] tasks.
2023-12-21 23:22:50.403 [job-0] INFO JobContainer - DataX Writer.Job [streamwriter] splits to [1] tasks.
2023-12-21 23:22:50.421 [job-0] INFO JobContainer - jobContainer starts to do schedule ...
2023-12-21 23:22:50.426 [job-0] INFO JobContainer - Scheduler starts [1] taskGroups.
2023-12-21 23:22:50.429 [job-0] INFO JobContainer - Running by standalone Mode.
2023-12-21 23:22:50.438 [taskGroup-0] INFO TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2023-12-21 23:22:50.442 [taskGroup-0] INFO Channel - Channel set byte_speed_limit to -1, No bps activated.
2023-12-21 23:22:50.442 [taskGroup-0] INFO Channel - Channel set record_speed_limit to -1, No tps activated.
2023-12-21 23:22:50.468 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2023-12-21 23:22:50.789 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[335]ms
2023-12-21 23:22:50.790 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] completed it's tasks.
2023-12-21 23:23:00.452 [job-0] INFO StandAloneJobContainerCommunicator - Total 100000 records, 2600000 bytes | Speed 253.91KB/s, 10000 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.046s | All Task WaitReaderTime 0.056s | Percentage 100.00%
2023-12-21 23:23:00.453 [job-0] INFO AbstractScheduler - Scheduler accomplished all tasks.
2023-12-21 23:23:00.453 [job-0] INFO JobContainer - DataX Writer.Job [streamwriter] do post work.
2023-12-21 23:23:00.453 [job-0] INFO JobContainer - DataX Reader.Job [streamreader] do post work.
2023-12-21 23:23:00.453 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.
2023-12-21 23:23:00.454 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /home/admin/datax/hook
2023-12-21 23:23:00.455 [job-0] INFO JobContainer -

    [total cpu info] =>
        averageCpu | maxDeltaCpu | minDeltaCpu
        -1.00%     | -1.00%      | -1.00%

    [total gc info] =>
        NAME         | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
        PS MarkSweep | 0            | 0               | 0               | 0.000s      | 0.000s         | 0.000s
        PS Scavenge  | 0            | 0               | 0               | 0.000s      | 0.000s         | 0.000s

2023-12-21 23:23:00.455 [job-0] INFO JobContainer - PerfTrace not enable!
2023-12-21 23:23:00.456 [job-0] INFO StandAloneJobContainerCommunicator - Total 100000 records, 2600000 bytes | Speed 253.91KB/s, 10000 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.046s | All Task WaitReaderTime 0.056s | Percentage 100.00%
2023-12-21 23:23:00.457 [job-0] INFO JobContainer -

Job start time            : 2023-12-21 23:22:50
Job end time              : 2023-12-21 23:23:00
Total elapsed time        : 10s
Average throughput        : 253.91KB/s
Record write speed        : 10000rec/s
Total records read        : 100000
Total read/write failures : 0

5. Create the DataX job JSON

{
    "job": {
        "entry": {
            "jvm": "-Xms1024m -Xmx1024m"
        },
        "setting": {
            "speed": {
                "channel": 4
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.1
            }
        },
        "content": [{
            "reader": {
                "name": "mysqlreader",
                "parameter": {
                    "username": "root",
                    "password": "oracle123",
                    "column": ["*"],
                    "connection": [{
                        "table": ["Tab_A"],
                        "jdbcUrl": ["jdbc:mysql://192.168.150.162:4000/test?useUnicode=true&characterEncoding=utf8&useSSL=false"]
                    }]
                }
            },
            "writer": {
                "name": "oceanbasev10writer",
                "parameter": {
                    "obWriteMode": "insert",
                    "column": ["*"],
                    "preSql": ["truncate table Tab_A"],
                    "connection": [{
                        "jdbcUrl": "||_dsc_ob10_dsc_||obdemo:obmysql||_dsc_ob10_dsc_||jdbc:oceanbase://192.168.150.116:2883/test?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true",
                        "table": ["Tab_A"]
                    }],
                    "username": "root",
                    "password": "oracle123",
                    "writerThreadCount": 10,
                    "batchSize": 1000,
                    "memstoreThreshold": "0.9"
                }
            }
        }]
    }
}
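Two notes on this file. The odd-looking ||_dsc_ob10_dsc_|| token in the writer's jdbcUrl is the oceanbasev10writer separator that prefixes the JDBC URL with cluster:tenant (here obdemo:obmysql). And since DataX aborts on malformed configuration, it is cheap to syntax-check the JSON before running it:

# The path is hypothetical; point it at wherever you saved the job file.
python -m json.tool ../job/mysql2ob.json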

6. Run the offline data sync

Source data:

MySQL [test]> select * from Tab_A;
+----+------+------+------+------+------+--------+
| id | bid  | cid  | name | type | num  | amt    |
+----+------+------+------+------+------+--------+
|  1 |    1 |    1 | A01  | 01   |  111 | 111.00 |
|  2 |    2 |    2 | A01  | 01   |  112 | 111.00 |
|  3 |    3 |    3 | A02  | 02   |  113 | 111.00 |
|  4 |    4 |    4 | A02  | 02   |  112 | 111.00 |
|  5 |    5 |    5 | A01  | 01   |  111 | 111.00 |
|  6 |    6 |    6 | A02  | 02   |  113 | 111.00 |
|  7 |    5 |    7 | A01  | 01   |  111 |  88.00 |
|  8 |    6 |    8 | A02  | 02   |  113 |  88.00 |
+----+------+------+------+------+------+--------+
8 rows in set (0.26 sec)

Target data (empty before the sync):

obclient [test]> select * from Tab_A;
Empty set (0.133 sec)

Run the sync:

python ./datax.py ../job/mysql2ob.json

2023-12-22 00:42:13.745 [job-0] INFO JobContainer - PerfTrace not enable!
2023-12-22 00:42:13.745 [job-0] INFO StandAloneJobContainerCommunicator - Total 8 records, 134 bytes | Speed 13B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.020s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-12-22 00:42:13.747 [job-0] INFO JobContainer -

Job start time            : 2023-12-22 00:41:54
Job end time              : 2023-12-22 00:42:13
Total elapsed time        : 19s
Average throughput        : 13B/s
Record write speed        : 0rec/s
Total records read        : 8
Total read/write failures : 0

7. Verify the data:

obclient [test]> select * from Tab_A;
+----+------+------+------+------+------+--------+
| id | bid  | cid  | name | type | num  | amt    |
+----+------+------+------+------+------+--------+
|  1 |    1 |    1 | A01  | 01   |  111 | 111.00 |
|  2 |    2 |    2 | A01  | 01   |  112 | 111.00 |
|  3 |    3 |    3 | A02  | 02   |  113 | 111.00 |
|  4 |    4 |    4 | A02  | 02   |  112 | 111.00 |
|  5 |    5 |    5 | A01  | 01   |  111 | 111.00 |
|  6 |    6 |    6 | A02  | 02   |  113 | 111.00 |
|  7 |    5 |    7 | A01  | 01   |  111 |  88.00 |
|  8 |    6 |    8 | A02  | 02   |  113 |  88.00 |
+----+------+------+------+------+------+--------+
8 rows in set (0.002 sec)
