ClickHouse Cluster Setup and ODBC Configuration

1 Environment

192.168.122.100 BCEuler01

192.168.122.101 BCEuler02

192.168.122.102 BCEuler03

1.1 Disable the firewall

systemctl disable firewalld

systemctl stop firewalld

1.2 Disable SELinux

Edit /etc/selinux/config and set:

SELINUX=disabled
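To make the change take effect without a reboot, the runtime mode can also be switched off (a supplementary step, not in the original procedure):

setenforce 0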

1.3 Configure time synchronization (chrony)

Use BCEuler01 as the time source.

In /etc/chrony.conf on BCEuler01, the modified parts are:

server 192.168.122.100 iburst

allow 192.168.0.0/16

local stratum 10

In /etc/chrony.conf on BCEuler02:

pool 192.168.122.100 iburst

In /etc/chrony.conf on BCEuler03:

pool 192.168.122.100 iburst

Start the service:

systemctl enable chronyd

systemctl start chronyd

Check the synchronization status:

chronyc sources -v
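For a more detailed view of the offset against the time source, chronyc tracking can also be used:

chronyc tracking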

2 Set up the ZooKeeper cluster

A ZooKeeper cluster is built across the three servers.

2.1 Download apache-zookeeper-3.8.2-bin from the official site

Version 3.9 does not work with ClickHouse and causes problems.

2.2 Upload apache-zookeeper-3.8.2-bin

Upload it to the /opt directory on all three servers.
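Assuming the standard tarball apache-zookeeper-3.8.2-bin.tar.gz from the Apache site was uploaded, extract it on each server:

cd /opt
tar -xzf apache-zookeeper-3.8.2-bin.tar.gz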

2.3 Configure ZooKeeper (zoo.cfg)

cd /opt/apache-zookeeper-3.8.2-bin/conf/

cp zoo_sample.cfg zoo.cfg

The configuration is identical on all three servers:

more zoo.cfg

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper/data
clientPort=2181
server.0=192.168.122.100:2888:3888
server.1=192.168.122.101:2888:3888
server.2=192.168.122.102:2888:3888
maxSessionTimeout=120000
forceSync=no
autopurge.snapRetainCount=20
autopurge.purgeInterval=48
4lw.commands.whitelist=*

2.4 Create the data directory on all three servers

mkdir -p /data/zookeeper/data

2.5 Write a different myid file on each server

2.5.1 On BCEuler01 (192.168.122.100)

echo 0 > /data/zookeeper/data/myid

2.5.2 On BCEuler02 (192.168.122.101)

echo 1 > /data/zookeeper/data/myid

2.5.3 On BCEuler03 (192.168.122.102)

echo 2 > /data/zookeeper/data/myid

2.6 Start ZooKeeper (run on all three servers)

cd /opt/apache-zookeeper-3.8.2-bin

nohup bin/zkServer.sh start &

2.7 Check ZooKeeper status

2.7.1 Method 1

bin/zkServer.sh status

2.7.2 Method 2

(this method requires 4lw.commands.whitelist=* in zoo.cfg):

echo status | nc 192.168.122.100 2181

echo status | nc 192.168.122.101 2181

echo status | nc 192.168.122.102 2181
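The whitelist also enables the other four-letter-word admin commands; for example, a quick liveness probe:

echo ruok | nc 192.168.122.100 2181

A healthy server replies imok.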

3 Set up the ClickHouse cluster

Cluster plan

Each node hosts 2 shards (one replica of each), and each shard has 2 replicas.

Hosting 1 shard on a node means running 1 clickhouse-server process there;

hosting 2 shards means running 2 clickhouse-server processes on the node, each with its own port and its own configuration file.

Nodes:

BCEuler01

First shard: config.xml, tcp_port 9001

Second shard: config2.xml, tcp_port 9002

BCEuler02

First shard: config.xml, tcp_port 9001

Second shard: config2.xml, tcp_port 9002

3.1 Download from the official site and install with rpm (install on both nodes)

clickhouse-common-static-23.7.4.5.x86_64.rpm

clickhouse-client-23.7.4.5.x86_64.rpm

clickhouse-server-23.7.4.5.x86_64.rpm
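A typical way to install the three packages together, run from the directory they were uploaded to:

rpm -ivh clickhouse-common-static-23.7.4.5.x86_64.rpm clickhouse-client-23.7.4.5.x86_64.rpm clickhouse-server-23.7.4.5.x86_64.rpm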

3.2 Configure ClickHouse

cd /etc/clickhouse-server/

3.2.1 Set the password for the default user (do this on both nodes)

Edit the users.d/default-password.xml file and set the password.
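A minimal sketch of that file, assuming a SHA256-hashed password is used (generate the hash with: echo -n 'yourpassword' | sha256sum); the hash below is a placeholder:

<clickhouse>
    <users>
        <default>
            <password remove='1' />
            <password_sha256_hex>REPLACE_WITH_SHA256_HASH</password_sha256_hex>
        </default>
    </users>
</clickhouse>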

3.2.2 Configure config.xml and config2.xml

Because each node hosts 2 shards, i.e. runs 2 clickhouse-server processes, each node needs two config files (do this on both nodes):

cp config.xml config2.xml

The 4 configuration files are listed below. Their contents are largely identical; only the ports and the macros differ between them, and those differences are the important parts.

3.2.2.1 Node 1 (BCEuler01 config.xml)

(only the modified parts are shown)

<clickhouse>

<logger>
    <level>trace</level>
    <log>/var/log/clickhouse-server/clickhouse-server.log</log>
    <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
</logger>

<http_port>8123</http_port>
<tcp_port>9001</tcp_port>
<mysql_port>9004</mysql_port>
<postgresql_port>9005</postgresql_port>
<interserver_http_port>9009</interserver_http_port>

<!-- Path to data directory, with trailing slash. -->
<path>/data/clickhouse/</path>

<!-- Path to temporary data for processing hard queries. -->
<tmp_path>/data/clickhouse/tmp/</tmp_path>

<!-- Directory with user provided files that are accessible by 'file' table function. -->
<user_files_path>/data/clickhouse/user_files/</user_files_path>

<!-- Path to folder where users created by SQL commands are stored.
     In the stock config this <path> is the one nested inside
     <user_directories><local_directory>, not the top-level <path> above. -->
<path>/data/clickhouse/access/</path>

<remote_servers>
<!-- Test only shard config for testing distributed storage -->
<ck_2shard_2replica_cluster>

<shard>
<internal_replication>true</internal_replication>
<weight>1</weight>
<replica>
<host>BCEuler01</host>
<port>9001</port>
</replica>

<replica>
<host>BCEuler02</host>
<port>9002</port>
</replica>
</shard>

<shard>
<internal_replication>true</internal_replication>
<weight>1</weight>
<replica>
<host>BCEuler02</host>
<port>9001</port>
</replica>

<replica>
<host>BCEuler01</host>
<port>9002</port>
</replica>
</shard>
</ck_2shard_2replica_cluster>
</remote_servers>

<zookeeper>
<node>
<host>192.168.122.100</host>
<port>2181</port>
</node>
<node>
<host>192.168.122.101</host>
<port>2181</port>
</node>
<node>
<host>192.168.122.102</host>
<port>2181</port>
</node>
</zookeeper>

<macros>
<replica>BCEuler0101</replica>
<shard>01</shard>
</macros>

<!-- Directory in <clickhouse-path> containing schema files for various input formats.
     The directory will be created if it doesn't exist. -->
<format_schema_path>/data/clickhouse/format_schemas/</format_schema_path>

</clickhouse>

Explanation:

3.2.2.1.1 Port configuration

The ports configured in config.xml and config2.xml on the same node must differ (a port can only be used by one process, so config2.xml has to use different ports). tcp_port is also used later in the cluster configuration:

<http_port>8123</http_port>
<tcp_port>9001</tcp_port>
<mysql_port>9004</mysql_port>
<postgresql_port>9005</postgresql_port>
<interserver_http_port>9009</interserver_http_port>

3.2.2.1.2 Cluster shard configuration

<remote_servers>
    <!-- Test only shard config for testing distributed storage -->
    <ck_2shard_2replica_cluster>
        <shard>
            <internal_replication>true</internal_replication>
            <weight>1</weight>
            <replica>
                <host>BCEuler01</host>
                <port>9001</port>
            </replica>
            <replica>
                <host>BCEuler02</host>
                <port>9002</port>
            </replica>
        </shard>
        <shard>
            <internal_replication>true</internal_replication>
            <weight>1</weight>
            <replica>
                <host>BCEuler02</host>
                <port>9001</port>
            </replica>
            <replica>
                <host>BCEuler01</host>
                <port>9002</port>
            </replica>
        </shard>
    </ck_2shard_2replica_cluster>
</remote_servers>

Each

<shard>
.....
</shard>

block represents one shard; each node takes part in 2 shards, so the configuration contains two such blocks.

The first shard:

<shard>
    <internal_replication>true</internal_replication>
    <weight>1</weight>
    <replica>
        <host>BCEuler01</host>
        <port>9001</port>
    </replica>
    <replica>
        <host>BCEuler02</host>
        <port>9002</port>
    </replica>
</shard>

That is, the 1st clickhouse-server process on BCEuler01 and the 2nd clickhouse-server process on BCEuler02.

The second shard:

<shard>
    <internal_replication>true</internal_replication>
    <weight>1</weight>
    <replica>
        <host>BCEuler02</host>
        <port>9001</port>
    </replica>
    <replica>
        <host>BCEuler01</host>
        <port>9002</port>
    </replica>
</shard>

That is, the 2nd clickhouse-server process on BCEuler01 and the 1st clickhouse-server process on BCEuler02.

Summary:

On BCEuler01:

the 1st clickhouse-server process belongs to shard 1;

the 2nd clickhouse-server process belongs to shard 2.

On BCEuler02:

the 1st clickhouse-server process belongs to shard 2;

the 2nd clickhouse-server process belongs to shard 1.

3.2.2.1.3 Cluster replica configuration

Each

<replica>
</replica>

block represents one replica; each shard has 2 replicas, so each shard contains two such blocks.

The 2 replicas of shard 1:

<replica>
    <host>BCEuler01</host>
    <port>9001</port>
</replica>
<replica>
    <host>BCEuler02</host>
    <port>9002</port>
</replica>

The 2 replicas of shard 2:

<replica>
    <host>BCEuler02</host>
    <port>9001</port>
</replica>
<replica>
    <host>BCEuler01</host>
    <port>9002</port>
</replica>

3.2.2.1.4 ZooKeeper configuration

<zookeeper>
    <node>
        <host>192.168.122.100</host>
        <port>2181</port>
    </node>
    <node>
        <host>192.168.122.101</host>
        <port>2181</port>
    </node>
    <node>
        <host>192.168.122.102</host>
        <port>2181</port>
    </node>
</zookeeper>

3.2.2.1.5 Macros configuration

<macros>
    <replica>BCEuler0101</replica>
    <shard>01</shard>
</macros>

3.2.2.1.6 Other settings

<remote_servers>
    <!-- Test only shard config for testing distributed storage -->
    <ck_2shard_2replica_cluster>
    .....
    </ck_2shard_2replica_cluster>
</remote_servers>

ck_2shard_2replica_cluster is the cluster name; when creating databases and tables with CREATE statements, append ON CLUSTER ck_2shard_2replica_cluster so the DDL runs on every node.
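For example, a database created once with ON CLUSTER appears on all four clickhouse-server instances (testdb is a hypothetical name):

CREATE DATABASE testdb ON CLUSTER ck_2shard_2replica_cluster;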

<shard>
    <internal_replication>true</internal_replication>
    <weight>1</weight>
</shard>

internal_replication: how data replication is performed; true means a Distributed table writes to only one replica and the replicated table engine propagates the data to the others.

<weight>1</weight>: the relative weight used when distributing data across shards.

3.2.2.2 Node 1 (BCEuler01 config2.xml)

Only the differing parts are shown:

<http_port>8124</http_port>
<tcp_port>9002</tcp_port>
<mysql_port>9014</mysql_port>
<postgresql_port>9015</postgresql_port>
<interserver_http_port>9019</interserver_http_port>

<macros>
    <replica>BCEuler0102</replica>
    <shard>02</shard>
</macros>

3.2.2.3 Node 2 (BCEuler02 config.xml)

<http_port>8123</http_port>
<tcp_port>9001</tcp_port>
<mysql_port>9004</mysql_port>
<postgresql_port>9005</postgresql_port>
<interserver_http_port>9009</interserver_http_port>

<macros>
    <replica>BCEuler0201</replica>
    <shard>02</shard>
</macros>

3.2.2.4 Node 2 (BCEuler02 config2.xml)

<http_port>8124</http_port>
<tcp_port>9002</tcp_port>
<mysql_port>9014</mysql_port>
<postgresql_port>9015</postgresql_port>
<interserver_http_port>9019</interserver_http_port>

<macros>
    <replica>BCEuler0202</replica>
    <shard>01</shard>
</macros>

3.2.2.5 Notes

The shard and replica values in macros follow this logic:

shard: 01 for the first shard, 02 for the second shard

replica: "hostname + 01" for the first replica (first clickhouse-server process) on a host, "hostname + 02" for the second
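These macros are what let a single ON CLUSTER DDL statement work on every instance: {shard} and {replica} expand to different values on each clickhouse-server. A sketch of a replicated table that relies on them (database and table names are hypothetical):

CREATE TABLE testdb.t_local ON CLUSTER ck_2shard_2replica_cluster
(
    id UInt64,
    name String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/t_local', '{replica}')
ORDER BY id;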

3.3 Start the processes (same on both nodes)

First shard:

nohup clickhouse-server --config-file /etc/clickhouse-server/config.xml &

Second shard:

nohup clickhouse-server --config-file /etc/clickhouse-server/config2.xml &
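To confirm that both processes came up and are listening on their ports, a quick sanity check:

ss -lntp | grep clickhouse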

4 View the ClickHouse logs

tail -f /var/log/clickhouse-server/clickhouse-server.log

5 ODBC for PostgreSQL configuration

5.1 Install unixODBC

yum install unixODBC-devel unixODBC

5.2 Install psqlodbc-15.00.0000

Download the package from the PostgreSQL official site.

psqlodbc is built from source mainly to obtain psqlodbcw.so and libodbcpsqlS.so.
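A typical autoconf build sequence, assuming the source tarball has been extracted (unixODBC-devel from step 5.1 is needed at build time):

cd psqlodbc-15.00.0000
./configure
make
make install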

After building and installing, copy the two .so files to /usr/lib64:

cp /usr/local/lib/psqlodbcw.so /usr/lib64/

cp /usr/local/lib/libodbcpsqlS.so /usr/lib64/

5.3 Configure odbc.ini and odbcinst.ini


5.3.1 Configure odbcinst.ini

more /etc/odbcinst.ini

# Example driver definitions

# Driver from the postgresql-odbc package
# Setup from the unixODBC package
[PostgreSQL]
Description = ODBC for PostgreSQL
Driver      = /usr/lib/psqlodbcw.so
Setup       = /usr/lib/libodbcpsqlS.so
Driver64    = /usr/lib64/psqlodbcw.so
Setup64     = /usr/lib64/libodbcpsqlS.so
FileUsage   = 1

# Driver from the mysql-connector-odbc package
# Setup from the unixODBC package
[MySQL]
Description = ODBC for MySQL
Driver      = /usr/lib/libmyodbc5.so
Setup       = /usr/lib/libodbcmyS.so
Driver64    = /usr/lib64/libmyodbc5.so
Setup64     = /usr/lib64/libodbcmyS.so
FileUsage   = 1

5.3.2 Configure odbc.ini

more /etc/odbc.ini

Note: [postgresql] is the DSN (data source name) used in connection strings, similar to a database link name in Oracle.

[DEFAULT]
Driver = postgresql

[postgresql]
Description  = GP
Driver       = PostgreSQL
Database     = gpdbtest
Servername   = 192.168.122.102
UserName     = gpadmin
Password     = 1233455
Port         = 5432
ReadOnly     = 0
ConnSettings = set client_encoding to UTF8

5.3.3 Verify the ODBC setup

Check whether the remote Greenplum database can be reached:

isql postgresql gpadmin 1233455

The output should contain "Connected!".

6 Connect to the Greenplum database

6.1 Create a dictionary using the DSN postgresql

CREATE DICTIONARY test ON CLUSTER ck_2shard_2replica_cluster
(
    `id` Int64 DEFAULT 0,
    `name` String DEFAULT 'a'
)
PRIMARY KEY id
SOURCE(ODBC(CONNECTION_STRING 'DSN=postgresql' TABLE 'test_data'))
LIFETIME(MIN 180 MAX 240)
LAYOUT(HASHED());

postgresql is the DSN configured in /etc/odbc.ini.
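Once the dictionary is loaded, individual values can also be fetched with dictGet; a quick test, assuming the dictionary was created in the default database and a row with key 1 exists:

SELECT dictGet('default.test', 'name', toUInt64(1));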

6.2 Query Greenplum data from ClickHouse

clickhouse-client --port=9001 --user=default --ask-password

Enter the password of the default user when prompted.

BCEuler01 :) select * from test limit 10;

If rows come back, the connection works.

6.3 ODBC notes

6.3.1 Passwords containing special characters (%, _, @, etc.) for the Greenplum gpadmin user

After such a password is configured in /etc/odbc.ini, the connection fails with a wrong-password error; how to configure passwords with special characters is still unresolved.

6.3.2 Adding logging and a log level to the ODBC configuration helps pinpoint problems
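A minimal sketch using unixODBC's built-in tracing: add an [ODBC] section to /etc/odbcinst.ini (the trace file path here is an arbitrary choice):

[ODBC]
Trace     = Yes
TraceFile = /tmp/odbc_trace.log

The psqlodbc driver also understands Debug = 1 and CommLog = 1 in the DSN entry in /etc/odbc.ini for driver-level logs.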

(To be continued)
