ClickHouse Cluster Setup and ODBC Configuration

1 Environment

192.168.122.100 BCEuler01

192.168.122.101 BCEuler02

192.168.122.102 BCEuler03

1.1 Disable the firewall

systemctl disable firewalld

systemctl stop firewalld

1.2 Disable SELinux

Edit /etc/selinux/config and set:

SELINUX=disabled
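
A minimal shell sketch that applies both changes (setenforce 0 turns SELinux off immediately; the config edit makes it permanent after reboot):

sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
setenforce 0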

1.3 Configure time synchronization (chrony)

Use BCEuler01 as the time source.

BCEuler01's /etc/chrony.conf (modified lines only):

server 192.168.122.100 iburst

allow 192.168.0.0/16

local stratum 10

BCEuler02's /etc/chrony.conf:

pool 192.168.122.100 iburst

BCEuler03's /etc/chrony.conf:

pool 192.168.122.100 iburst

Start the service:

systemctl enable chronyd

systemctl start chronyd

Check synchronization status:

chronyc sources -v
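
chronyc tracking is another quick check; on BCEuler02 and BCEuler03 its Reference ID line should point at 192.168.122.100:

chronyc tracking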

2 Set up the ZooKeeper cluster

Build one ZooKeeper ensemble across the three servers.

2.1 Download apache-zookeeper-3.8.2-bin from the official site

ZooKeeper 3.9 does not match this ClickHouse version and causes problems.

2.2 Upload apache-zookeeper-3.8.2-bin

Upload it to /opt on all three servers.

2.3 Configure ZooKeeper (zoo.cfg)

cd /opt/apache-zookeeper-3.8.2-bin/conf/

cp zoo_sample.cfg zoo.cfg

The configuration is identical on all three servers:

more zoo.cfg

tickTime=2000

initLimit=10

syncLimit=5

dataDir=/data/zookeeper/data

clientPort=2181

server.0=192.168.122.100:2888:3888

server.1=192.168.122.101:2888:3888

server.2=192.168.122.102:2888:3888

maxSessionTimeout=120000

forceSync=no

autopurge.snapRetainCount=20

autopurge.purgeInterval=48

4lw.commands.whitelist=*

2.4 Create the data directory on all three servers

mkdir -p /data/zookeeper/data

2.5 Write a different myid file on each server

2.5.1 On BCEuler01 (192.168.122.100)

echo 0 > /data/zookeeper/data/myid

2.5.2 On BCEuler02 (192.168.122.101)

echo 1 > /data/zookeeper/data/myid

2.5.3 On BCEuler03 (192.168.122.102)

echo 2 > /data/zookeeper/data/myid

2.6 Start ZooKeeper (on all three servers)

cd /opt/apache-zookeeper-3.8.2-bin

bin/zkServer.sh start

(zkServer.sh start daemonizes ZooKeeper by itself, so nohup and & are unnecessary)

2.7 Check ZooKeeper status

2.7.1 Method 1

bin/zkServer.sh status

2.7.2 Method 2

(this method requires 4lw.commands.whitelist=* in zoo.cfg):

echo status | nc 192.168.122.100 2181

echo status | nc 192.168.122.101 2181

echo status | nc 192.168.122.102 2181
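
Exactly one node should report Mode: leader and the other two Mode: follower; anything else means the ensemble has not formed correctly.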

3 Set up the ClickHouse cluster

Cluster plan

The cluster has 2 shards, and each shard has 2 replicas.

A shard replica is one clickhouse-server process on a node; each node therefore runs 2 clickhouse-server processes, each with its own ports and its own config file.

Nodes:

BCEuler01

First clickhouse-server process: config.xml, tcp_port 9001

Second clickhouse-server process: config2.xml, tcp_port 9002

BCEuler02

First clickhouse-server process: config.xml, tcp_port 9001

Second clickhouse-server process: config2.xml, tcp_port 9002

3.1 Download from the official site and install with rpm (on both nodes)

clickhouse-common-static-23.7.4.5.x86_64.rpm

clickhouse-client-23.7.4.5.x86_64.rpm

clickhouse-server-23.7.4.5.x86_64.rpm

3.2 Configure ClickHouse

cd /etc/clickhouse-server/

3.2.1 Set the default user's password (on both nodes)

Edit users.d/default-password.xml and set the password.
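
A minimal sketch of users.d/default-password.xml, assuming the password is stored as a SHA-256 hex digest (generate one with: echo -n 'yourpassword' | sha256sum):

<clickhouse>
    <users>
        <default>
            <password remove='1' />
            <password_sha256_hex>REPLACE_WITH_SHA256_HEX</password_sha256_hex>
        </default>
    </users>
</clickhouse>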

3.2.2 Configure config.xml and config2.xml

Because each node hosts 2 shard replicas, i.e. runs 2 clickhouse-server processes, it needs two config files (do the following on both nodes):

cp config.xml config2.xml

Configuration files (all 4 files, two per node, are listed below; they are largely identical, and only the ports and macros sections differ between them, those differing settings are the important part):

3.2.2.1 Node 1 (BCEuler01 config.xml)

(only the modified parts are shown)

<clickhouse>

<logger>

<level>trace</level>

<log>/var/log/clickhouse-server/clickhouse-server.log</log>

<errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>

</logger>

<http_port>8123</http_port>
<tcp_port>9001</tcp_port>
<mysql_port>9004</mysql_port>
<postgresql_port>9005</postgresql_port>
<interserver_http_port>9009</interserver_http_port>

<!-- Path to data directory, with trailing slash. -->

<path>/data/clickhouse/</path>

<!-- Path to temporary data for processing hard queries. -->

<tmp_path>/data/clickhouse/tmp/</tmp_path>

<!-- Directory with user provided files that are accessible by 'file' table function. -->

<user_files_path>/data/clickhouse/user_files/</user_files_path>

<!-- Path to folder where users created by SQL commands are stored. -->

<access_control_path>/data/clickhouse/access/</access_control_path>

<remote_servers>
<!-- Test only shard config for testing distributed storage -->
<ck_2shard_2replica_cluster>

<shard>
<internal_replication>true</internal_replication>
<weight>1</weight>
<replica>
<host>BCEuler01</host>
<port>9001</port>
</replica>

<replica>
<host>BCEuler02</host>
<port>9002</port>
</replica>
</shard>

<shard>
<internal_replication>true</internal_replication>
<weight>1</weight>
<replica>
<host>BCEuler02</host>
<port>9001</port>
</replica>

<replica>
<host>BCEuler01</host>
<port>9002</port>
</replica>
</shard>
</ck_2shard_2replica_cluster>
</remote_servers>

<zookeeper>
<node>
<host>192.168.122.100</host>
<port>2181</port>
</node>
<node>
<host>192.168.122.101</host>
<port>2181</port>
</node>
<node>
<host>192.168.122.102</host>
<port>2181</port>
</node>
</zookeeper>

<macros>
<replica>BCEuler0101</replica>
<shard>01</shard>
</macros>

<!-- Directory in <clickhouse-path> containing schema files for various input formats.

The directory will be created if it doesn't exist.

-->

<format_schema_path>/data/clickhouse/format_schemas/</format_schema_path>


</clickhouse>
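
Before starting the server you can confirm the edited file still parses and that the intended values took effect; clickhouse extract-from-config ships with the server packages:

clickhouse extract-from-config --config-file /etc/clickhouse-server/config.xml --key=tcp_port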

Explanation:

3.2.2.1.1 Port configuration

These ports differ between config.xml and config2.xml on the same node (a port can be bound by only one process, so config2.xml must use different ports); tcp_port is also referenced in the cluster definition.

<http_port>8123</http_port>

<tcp_port>9001</tcp_port>

<mysql_port>9004</mysql_port>

<postgresql_port>9005</postgresql_port>

<interserver_http_port>9009</interserver_http_port>

3.2.2.1.2 Cluster shard configuration

<remote_servers>

<!-- Test only shard config for testing distributed storage -->

<ck_2shard_2replica_cluster>

<shard>

<internal_replication>true</internal_replication>

<weight>1</weight>

<replica>

<host>BCEuler01</host>

<port>9001</port>

</replica>

<replica>

<host>BCEuler02</host>

<port>9002</port>

</replica>

</shard>

<shard>

<internal_replication>true</internal_replication>

<weight>1</weight>

<replica>

<host>BCEuler02</host>

<port>9001</port>

</replica>

<replica>

<host>BCEuler01</host>

<port>9002</port>

</replica>

</shard>

</ck_2shard_2replica_cluster>

</remote_servers>

A <shard> ... </shard> block represents one shard; the cluster has two shards, so the configuration contains two such blocks.

Shard 1:

<shard>

<internal_replication>true</internal_replication>

<weight>1</weight>

<replica>

<host>BCEuler01</host>

<port>9001</port>

</replica>

<replica>

<host>BCEuler02</host>

<port>9002</port>

</replica>

</shard>

This is the 1st clickhouse-server process on BCEuler01 plus the 2nd clickhouse-server process on BCEuler02.

Shard 2:

<shard>

<internal_replication>true</internal_replication>

<weight>1</weight>

<replica>

<host>BCEuler02</host>

<port>9001</port>

</replica>

<replica>

<host>BCEuler01</host>

<port>9002</port>

</replica>

</shard>

This is the 2nd clickhouse-server process on BCEuler01 plus the 1st clickhouse-server process on BCEuler02.

Summary:

On BCEuler01:

the 1st clickhouse-server process (port 9001) serves shard 1

the 2nd clickhouse-server process (port 9002) serves shard 2

On BCEuler02:

the 1st clickhouse-server process (port 9001) serves shard 2

the 2nd clickhouse-server process (port 9002) serves shard 1

3.2.2.1.3 Cluster replica configuration

A <replica> ... </replica> block represents one replica; each shard has 2 replicas, so each shard contains two such blocks.

Shard 1 has two replicas:

<replica>

<host>BCEuler01</host>

<port>9001</port>

</replica>

<replica>

<host>BCEuler02</host>

<port>9002</port>

</replica>

Shard 2 has two replicas:

<replica>

<host>BCEuler02</host>

<port>9001</port>

</replica>

<replica>

<host>BCEuler01</host>

<port>9002</port>

</replica>

3.2.2.1.4 ZooKeeper configuration

<zookeeper>

<node>

<host>192.168.122.100</host>

<port>2181</port>

</node>

<node>

<host>192.168.122.101</host>

<port>2181</port>

</node>

<node>

<host>192.168.122.102</host>

<port>2181</port>

</node>

</zookeeper>

3.2.2.1.5 Macros configuration

<macros>

<replica>BCEuler0101</replica>

<shard>01</shard>

</macros>
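
These macros are substituted into the ZooKeeper path and replica name of ReplicatedMergeTree tables, so the same CREATE statement works on every process; a minimal sketch (events is a hypothetical table name):

CREATE TABLE events on cluster ck_2shard_2replica_cluster
(
    `id` UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
ORDER BY id;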

3.2.2.1.6 Other settings

<remote_servers>

<!-- Test only shard config for testing distributed storage -->

<ck_2shard_2replica_cluster>

.....

</ck_2shard_2replica_cluster>

</remote_servers>

ck_2shard_2replica_cluster is the cluster name; when creating databases and tables, the CREATE statement must include on cluster ck_2shard_2replica_cluster.

<shard>

<internal_replication>true</internal_replication>

<weight>1</weight>

</shard>

internal_replication selects how data is replicated: with true, a write through a Distributed table goes to a single replica of the shard, and the Replicated* table engine copies it to the other replica via ZooKeeper.

<weight>1</weight> is the relative weight used when distributing rows across shards.
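
Writes and queries typically go through a Distributed table layered on top of the local replicated table; a sketch reusing the hypothetical events table from 3.2.2.1.5 (rand() spreads rows across the two shards according to their weights):

CREATE TABLE events_dist on cluster ck_2shard_2replica_cluster AS events
ENGINE = Distributed(ck_2shard_2replica_cluster, default, events, rand());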

3.2.2.2 Node 1 (BCEuler01 config2.xml)

Only the differing parts are listed:

<http_port>8124</http_port>

<tcp_port>9002</tcp_port>

<mysql_port>9014</mysql_port>

<postgresql_port>9015</postgresql_port>

<interserver_http_port>9019</interserver_http_port>

<macros>

<replica>BCEuler0102</replica>

<shard>02</shard>

</macros>

3.2.2.3 Node 2 (BCEuler02 config.xml)

<http_port>8123</http_port>

<tcp_port>9001</tcp_port>

<mysql_port>9004</mysql_port>

<postgresql_port>9005</postgresql_port>

<interserver_http_port>9009</interserver_http_port>

<macros>

<replica>BCEuler0201</replica>

<shard>02</shard>

</macros>

3.2.2.4 Node 2 (BCEuler02 config2.xml)

<http_port>8124</http_port>

<tcp_port>9002</tcp_port>

<mysql_port>9014</mysql_port>

<postgresql_port>9015</postgresql_port>

<interserver_http_port>9019</interserver_http_port>

<macros>

<replica>BCEuler0202</replica>

<shard>01</shard>

</macros>

3.2.2.5 Notes

The shard and replica values in macros are assigned with this logic:

shard: 01 for the first shard, 02 for the second shard

replica: "hostname01" for a shard's first replica and "hostname02" for its second, so BCEuler02's two processes use BCEuler0201 and BCEuler0202

3.3 Start the processes (same on both nodes)

First process:

nohup clickhouse-server --config-file /etc/clickhouse-server/config.xml &

Second process:

nohup clickhouse-server --config-file /etc/clickhouse-server/config2.xml &
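
After both processes are up, the cluster topology can be checked from either one via the built-in system tables:

clickhouse-client --port=9001 --user=default --ask-password

SELECT cluster, shard_num, replica_num, host_name, port FROM system.clusters WHERE cluster = 'ck_2shard_2replica_cluster';

SELECT * FROM system.macros;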

4 View the ClickHouse logs

tail -f /var/log/clickhouse-server/clickhouse-server.log

5 ODBC for PostgreSQL configuration

5.1 Install unixODBC

yum install unixODBC-devel unixODBC

5.2 Install psqlodbc-15.00.0000

Download the package from the PostgreSQL website.

psqlodbc is built from source mainly to obtain psqlodbcw.so and libodbcpsqlS.so.
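
A typical source build (a sketch; it assumes the psqlodbc-15.00.0000 tarball and the PostgreSQL client headers, e.g. from yum install postgresql-devel, are present):

tar xzf psqlodbc-15.00.0000.tar.gz
cd psqlodbc-15.00.0000
./configure
make
make install

By default the libraries land in /usr/local/lib, which is where the copy commands below pick them up.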

After building and installing, copy the two .so files to /usr/lib64:

cp /usr/local/lib/psqlodbcw.so /usr/lib64/

cp /usr/local/lib/libodbcpsqlS.so /usr/lib64/

5.3 Configure odbc.ini and odbcinst.ini


5.3.1 Configure odbcinst.ini

more /etc/odbcinst.ini

# Example driver definitions

# Driver from the postgresql-odbc package

# Setup from the unixODBC package

[PostgreSQL]

Description = ODBC for PostgreSQL

Driver = /usr/lib/psqlodbcw.so

Setup = /usr/lib/libodbcpsqlS.so

Driver64 = /usr/lib64/psqlodbcw.so

Setup64 = /usr/lib64/libodbcpsqlS.so

FileUsage = 1

# Driver from the mysql-connector-odbc package

# Setup from the unixODBC package

[MySQL]

Description = ODBC for MySQL

Driver = /usr/lib/libmyodbc5.so

Setup = /usr/lib/libodbcmyS.so

Driver64 = /usr/lib64/libmyodbc5.so

Setup64 = /usr/lib64/libodbcmyS.so

FileUsage = 1

5.3.2 Configure odbc.ini

more /etc/odbc.ini

Note: [postgresql] is the data source name (DSN) for the connection, similar to an Oracle database link name.

[DEFAULT]

Driver = postgresql

[postgresql]

Description = GP

Driver = PostgreSQL

Database = gpdbtest

Servername = 192.168.122.102

UserName = gpadmin

Password = 1233455

Port = 5432

ReadOnly = 0

ConnSettings = set client_encoding to UTF8

5.3.3 Verify the ODBC setup

Check that the remote Greenplum database is reachable:

isql postgresql gpadmin 1233455

A successful connection prints Connected!

6 Connect to the Greenplum database

6.1 Create a dictionary that uses the DSN postgresql

CREATE DICTIONARY on cluster ck_2shard_2replica_cluster test(

`id` Int64 DEFAULT 0,

`name` String DEFAULT 'a'

)

PRIMARY KEY id

SOURCE(ODBC(CONNECTION_STRING 'DSN=postgresql' TABLE 'test_data'))

LIFETIME(MIN 180 MAX 240)

LAYOUT(HASHED());

postgresql is the DSN (database-link-style) name configured in /etc/odbc.ini.
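
Individual values can then be looked up by key with dictGet (the key 1 here is just an example):

SELECT dictGet('test', 'name', toUInt64(1));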

6.2 Query Greenplum data from ClickHouse

clickhouse-client --port=9001 --user=default --ask-password

Enter the default user's password when prompted.

BCEuler01 :) select * from test limit 10;

If rows come back, the connection works.

6.3 ODBC notes

6.3.1 If the Greenplum gpadmin password contains special characters (%, _, @, etc.)

Once such a password is put in /etc/odbc.ini, the connection fails with a password error; how to escape special characters there is still unresolved.

6.3.2 Add a log file and log level to the ODBC configuration to see what is actually failing
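
For example, unixODBC-level tracing can be switched on in /etc/odbcinst.ini, and the psqlODBC driver additionally understands per-DSN Debug and CommLog switches (option names as documented by unixODBC and psqlODBC):

[ODBC]
Trace = Yes
TraceFile = /tmp/odbc_trace.log

and in the DSN section of /etc/odbc.ini:

Debug = 1
CommLog = 1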

(To be continued)
