Oceanbase学习之一迁移mysql数据到oceanbase

一、数据库环境

#mysql环境

root@192.168.150.162 20:28: (none)> select version();

±----------+

| version() |

±----------+

| 8.0.26 |

±----------+

1 row in set (0.00 sec)

root@192.168.150.162 20:28: (none)> show variables like '%char%';

±-------------------------±----------------------------------+

| Variable_name | Value |

±-------------------------±----------------------------------+

| character_set_client | utf8mb4 |

| character_set_connection | utf8mb4 |

| character_set_database | utf8mb4 |

| character_set_filesystem | binary |

| character_set_results | utf8mb4 |

| character_set_server | utf8mb4 |

| character_set_system | utf8mb3 |

| character_sets_dir | /usr/local/mysql8/share/charsets/ |

±-------------------------±----------------------------------+

8 rows in set (0.00 sec)

#当前mysql环境下的表

#oceanbase环境

obclient test> select version();

±-----------------------------+

| version() |

±-----------------------------+

| 5.7.25-OceanBase_CE-v4.2.1.2 |

±-----------------------------+

1 row in set (0.002 sec)

obclient test> show variables like '%chara%';

±-------------------------±--------+

| Variable_name | Value |

±-------------------------±--------+

| character_set_client | utf8mb4 |

| character_set_connection | utf8mb4 |

| character_set_database | utf8mb4 |

| character_set_filesystem | binary |

| character_set_results | utf8mb4 |

| character_set_server | utf8mb4 |

| character_set_system | utf8mb4 |

±-------------------------±--------+

7 rows in set (0.005 sec)

确认mysql与oceanbase的字符集一样

二、mysqldump迁移数据到OceanBase

通过MySQL下的mysqldump将数据导出为SQL文本格式,将数据备份文件传输到OceanBase数据库主机后,通过source命令导入到OceanBase数据库。

#当前mysql下的表

MySQL (none)> use test;

Reading table information for completion of table and column names

You can turn off this feature to get a quicker startup with -A

Database changed

MySQL test> show tables;

±------------------+

| Tables_in_test |

±------------------+

| cluster_test |

| cluster_test1 |

| cluster_test2 |

| cluster_test3 |

| cluster_test4 |

| t1 |

| t2 |

| t8 |

| t_smallint |

| test_clustered |

| test_nonclustered |

±------------------+

11 rows in set (0.00 sec)

通过mysqldump导出数据

mysqldump -h 192.168.150.162 -uroot -P4000 -p --database test > test_oceanbase.sql

#传输脚本到oceanbase服务器

scp test_oceanbase.sql 192.168.150.116:/home/admin

#oceanbase导入

obclient test> source test_oceanbase.sql

obclient test> show tables;

±------------------+

| Tables_in_test |

±------------------+

| cluster_test |

| cluster_test1 |

| cluster_test2 |

| cluster_test3 |

| cluster_test4 |

| t1 |

| t2 |

| t8 |

| t_smallint |

| test_clustered |

| test_nonclustered |

±------------------+

11 rows in set (0.004 sec)

#抽查表和数据已经导入

obclient test> select * from big

-> ;

±---------------------±--------------------+

| id | id1 |

±---------------------±--------------------+

| 18446744073709551615 | 9223372036854775807 |

±---------------------±--------------------+

1 row in set (0.003 sec)

三、通过datax从MySQL离线导入数据到OceanBase

#datax部署安装

datax 下载地址:https://datax-opensource.oss-cn-hangzhou.aliyuncs.com/202210/datax.tar.gz

1、直接服务器上下载datax

wget https://datax-opensource.oss-cn-hangzhou.aliyuncs.com/202210/datax.tar.gz

2、解压datax

admin@localhost \~$ tar zxvf datax.tar.gz

3、安装java

yum install java

4、测试datax是否安装成功

admin@localhost bin$ python datax.py .../job/job.json

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !

Copyright © 2010-2017, Alibaba Group. All Rights Reserved.

2023-12-21 23:22:50.245 main INFO MessageSource - JVM TimeZone: GMT+08:00, Locale: zh_CN

2023-12-21 23:22:50.248 main INFO MessageSource - use Locale: zh_CN timeZone: sun.util.calendar.ZoneInfoid="GMT+08:00",offset=28800000,dstSavings=0,useDaylight=false,transitions=0,lastRule=null

2023-12-21 23:22:50.307 main INFO VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl

2023-12-21 23:22:50.312 main INFO Engine - the machine info =>

osInfo: Red Hat, Inc. 1.8 25.392-b08

jvmInfo: Linux amd64 3.10.0-1160.el7.x86_64

cpu num: 8

totalPhysicalMemory: -0.00G

freePhysicalMemory: -0.00G

maxFileDescriptorCount: -1

currentOpenFileDescriptorCount: -1

GC Names PS MarkSweep, PS Scavenge

MEMORY_NAME | allocation_size | init_size

PS Eden Space | 256.00MB | 256.00MB

Code Cache | 240.00MB | 2.44MB

Compressed Class Space | 1,024.00MB | 0.00MB

PS Survivor Space | 42.50MB | 42.50MB

PS Old Gen | 683.00MB | 683.00MB

Metaspace | -0.00MB | 0.00MB

2023-12-21 23:22:50.329 main INFO Engine -

{

"content":[

{

"reader":{

"name":"streamreader",

"parameter":{

"column":[

{

"type":"string",

"value":"DataX"

},

{

"type":"long",

"value":19890604

},

{

"type":"date",

"value":"1989-06-04 00:00:00"

},

{

"type":"bool",

"value":true

},

{

"type":"bytes",

"value":"test"

}

],

"sliceRecordCount":100000

}

},

"writer":{

"name":"streamwriter",

"parameter":{

"encoding":"UTF-8",

"print":false

}

}

}

],

"setting":{

"errorLimit":{

"percentage":0.02,

"record":0

},

"speed":{

"channel":1

}

}

}

2023-12-21 23:22:50.348 main WARN Engine - prioriy set to 0, because NumberFormatException, the value is: null

2023-12-21 23:22:50.361 main INFO PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0

2023-12-21 23:22:50.361 main INFO JobContainer - DataX jobContainer starts job.

2023-12-21 23:22:50.365 main INFO JobContainer - Set jobId = 0

2023-12-21 23:22:50.402 job-0 INFO JobContainer - jobContainer starts to do prepare ...

2023-12-21 23:22:50.402 job-0 INFO JobContainer - DataX Reader.Job streamreader do prepare work .

2023-12-21 23:22:50.402 job-0 INFO JobContainer - DataX Writer.Job streamwriter do prepare work .

2023-12-21 23:22:50.402 job-0 INFO JobContainer - jobContainer starts to do split ...

2023-12-21 23:22:50.403 job-0 INFO JobContainer - Job set Channel-Number to 1 channels.

2023-12-21 23:22:50.403 job-0 INFO JobContainer - DataX Reader.Job streamreader splits to 1 tasks.

2023-12-21 23:22:50.403 job-0 INFO JobContainer - DataX Writer.Job streamwriter splits to 1 tasks.

2023-12-21 23:22:50.421 job-0 INFO JobContainer - jobContainer starts to do schedule ...

2023-12-21 23:22:50.426 job-0 INFO JobContainer - Scheduler starts 1 taskGroups.

2023-12-21 23:22:50.429 job-0 INFO JobContainer - Running by standalone Mode.

2023-12-21 23:22:50.438 taskGroup-0 INFO TaskGroupContainer - taskGroupId=0 start 1 channels for 1 tasks.

2023-12-21 23:22:50.442 taskGroup-0 INFO Channel - Channel set byte_speed_limit to -1, No bps activated.

2023-12-21 23:22:50.442 taskGroup-0 INFO Channel - Channel set record_speed_limit to -1, No tps activated.

2023-12-21 23:22:50.468 taskGroup-0 INFO TaskGroupContainer - taskGroup0 taskId0 attemptCount1 is started

2023-12-21 23:22:50.789 taskGroup-0 INFO TaskGroupContainer - taskGroup0 taskId0 is successed, used335ms

2023-12-21 23:22:50.790 taskGroup-0 INFO TaskGroupContainer - taskGroup0 completed it's tasks.

2023-12-21 23:23:00.452 job-0 INFO StandAloneJobContainerCommunicator - Total 100000 records, 2600000 bytes | Speed 253.91KB/s, 10000 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.046s | All Task WaitReaderTime 0.056s | Percentage 100.00%

2023-12-21 23:23:00.453 job-0 INFO AbstractScheduler - Scheduler accomplished all tasks.

2023-12-21 23:23:00.453 job-0 INFO JobContainer - DataX Writer.Job streamwriter do post work.

2023-12-21 23:23:00.453 job-0 INFO JobContainer - DataX Reader.Job streamreader do post work.

2023-12-21 23:23:00.453 job-0 INFO JobContainer - DataX jobId 0 completed successfully.

2023-12-21 23:23:00.454 job-0 INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /home/admin/datax/hook

2023-12-21 23:23:00.455 job-0 INFO JobContainer -

total cpu info =>

averageCpu | maxDeltaCpu | minDeltaCpu

-1.00% | -1.00% | -1.00%

total gc info =>

NAME | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime

PS MarkSweep | 0 | 0 | 0 | 0.000s | 0.000s | 0.000s

PS Scavenge | 0 | 0 | 0 | 0.000s | 0.000s | 0.000s

2023-12-21 23:23:00.455 job-0 INFO JobContainer - PerfTrace not enable!

2023-12-21 23:23:00.456 job-0 INFO StandAloneJobContainerCommunicator - Total 100000 records, 2600000 bytes | Speed 253.91KB/s, 10000 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.046s | All Task WaitReaderTime 0.056s | Percentage 100.00%

2023-12-21 23:23:00.457 job-0 INFO JobContainer -

任务启动时刻 : 2023-12-21 23:22:50

任务结束时刻 : 2023-12-21 23:23:00

任务总计耗时 : 10s

任务平均流量 : 253.91KB/s

记录写入速度 : 10000rec/s

读出记录总数 : 100000

读写失败总数 : 0

5创建datax-job的json

{

"job": {

"entry": {

"jvm": "-Xms1024m -Xmx1024m"

},

"setting": {

"speed": {

"channel": 4

},

"errorLimit": {

"record": 0,

"percentage": 0.1

}

},

"content": [{

"reader": {

"name": "mysqlreader",

"parameter": {

"username": "root",

"password": "oracle123",

"column": [

"*"

],

"connection": [{

"table": [

"Tab_A"

],

"jdbcUrl": "jdbc:mysql://192.168.150.162:4000/test?useUnicode=true\&characterEncoding=utf8\&useSSL=false"

}]

}

},

"writer": {

"name": "oceanbasev10writer",

"parameter": {

"obWriteMode": "insert",

"column": [

"*"

],

"preSql": [

"truncate table Tab_A"

],

"connection": [{

"jdbcUrl": "||dsc_ob10_dsc||obdemo:obmysql||dsc_ob10_dsc||jdbc:oceanbase://192.168.150.116:2883/test?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true",

"table": [

"Tab_A"

]

}],

"username": "root",

"password": "oracle123",

"writerThreadCount": 10,

"batchSize": 1000,

"memstoreThreshold": "0.9"

}

}

}]

}

}

6、执行离线数据同步

源端数据:

MySQL test> select * from Tab_A;

±---±-----±-----±-----±-----±-----±-------+

| id | bid | cid | name | type | num | amt |

±---±-----±-----±-----±-----±-----±-------+

| 1 | 1 | 1 | A01 | 01 | 111 | 111.00 |

| 2 | 2 | 2 | A01 | 01 | 112 | 111.00 |

| 3 | 3 | 3 | A02 | 02 | 113 | 111.00 |

| 4 | 4 | 4 | A02 | 02 | 112 | 111.00 |

| 5 | 5 | 5 | A01 | 01 | 111 | 111.00 |

| 6 | 6 | 6 | A02 | 02 | 113 | 111.00 |

| 7 | 5 | 7 | A01 | 01 | 111 | 88.00 |

| 8 | 6 | 8 | A02 | 02 | 113 | 88.00 |

±---±-----±-----±-----±-----±-----±-------+

8 rows in set (0.26 sec)

目标数据:

obclient test> select * from Tab_A;

Empty set (0.133 sec)

执行同步:

python ./datax.py .../job/mysql2ob.json

2023-12-22 00:42:13.745 job-0 INFO JobContainer - PerfTrace not enable!

2023-12-22 00:42:13.745 job-0 INFO StandAloneJobContainerCommunicator - Total 8 records, 134 bytes | Speed 13B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.020s | All Task WaitReaderTime 0.000s | Percentage 100.00%

2023-12-22 00:42:13.747 job-0 INFO JobContainer -

任务启动时刻 : 2023-12-22 00:41:54

任务结束时刻 : 2023-12-22 00:42:13

任务总计耗时 : 19s

任务平均流量 : 13B/s

记录写入速度 : 0rec/s

读出记录总数 : 8

读写失败总数 : 0

7、检查数据:

obclient test> select * from Tab_A;

±---±-----±-----±-----±-----±-----±-------+

| id | bid | cid | name | type | num | amt |

±---±-----±-----±-----±-----±-----±-------+

| 1 | 1 | 1 | A01 | 01 | 111 | 111.00 |

| 2 | 2 | 2 | A01 | 01 | 112 | 111.00 |

| 3 | 3 | 3 | A02 | 02 | 113 | 111.00 |

| 4 | 4 | 4 | A02 | 02 | 112 | 111.00 |

| 5 | 5 | 5 | A01 | 01 | 111 | 111.00 |

| 6 | 6 | 6 | A02 | 02 | 113 | 111.00 |

| 7 | 5 | 7 | A01 | 01 | 111 | 88.00 |

| 8 | 6 | 8 | A02 | 02 | 113 | 88.00 |

±---±-----±-----±-----±-----±-----±-------+

8 rows in set (0.002 sec)

相关推荐
OceanBase数据库官方博客20 分钟前
OceanBase 4.4.2 LTS:Agent时代需要数据库超越存储角色
oceanbase
superonion06204 天前
【DB2】【Oceanbase】使用OMS将DB2迁移到Oceanbase测试
oceanbase·db
OceanBase数据库官方博客10 天前
OceanBase助力渤海银行:核心系统升级的实践与思考
oceanbase
OceanBase数据库官方博客10 天前
常州公积金采用OceanBase,三年稳定运行并实现智慧服务新范式
数据库·oceanbase
OceanBase数据库官方博客16 天前
现代数据架构:一套技术栈统一 TP、AP 与 AI
架构·oceanbase
粉墨白伶18 天前
【OceanBase】社区版三节点多副本容灾部署方案
oceanbase
OceanBase数据库官方博客18 天前
OceanBase seekdb-cli:专为 AI Agent 设计的数据库接口
数据库·人工智能·oceanbase
雅俗数据库22 天前
OCP实验 | 02-全链路诊断
oceanbase
雅俗数据库22 天前
OCP实验 | 05-运维管理
oceanbase
落日流年22 天前
欧拉操作系统部署OceanBase集群
运维·oceanbase