1 Environment setup
192.168.122.100 BCEuler01
192.168.122.101 BCEuler02
192.168.122.102 BCEuler03
1.1 Disable the firewall
systemctl disable firewalld
systemctl stop firewalld
1.2 Disable SELinux
In /etc/selinux/config, set:
SELINUX=disabled
1.3 Set up time synchronization (chrony)
Use BCEuler01 as the time source.
/etc/chrony.conf on BCEuler01:
Modified lines:
server 192.168.122.100 iburst
allow 192.168.0.0/16
local stratum 10
/etc/chrony.conf on BCEuler02:
pool 192.168.122.100 iburst
/etc/chrony.conf on BCEuler03:
pool 192.168.122.100 iburst
Start the service:
systemctl enable chronyd
systemctl start chronyd
Check the synchronization status:
chronyc sources -v
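To check the result from a script rather than by eye, the sketch below extracts the currently selected source from the `chronyc sources` output. The function name selected_source is hypothetical. It relies on the selected source being marked with `*` in the second character of the first column.

```shell
# Print the address of the source chronyd is currently synchronised to.
# Reads `chronyc sources` output on stdin; the selected source is the line
# whose first column ends in '*', for example "^* 192.168.122.100 ...".
# Prints nothing when no source is selected yet.
selected_source() {
    awk '$1 ~ /^.\*$/ { print $2 }'
}
```

On BCEuler02 and BCEuler03, `chronyc sources | selected_source` should print 192.168.122.100 once synchronization has settled.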
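2 Set up the ZooKeeper cluster
The three servers form one ZooKeeper cluster.
2.1 Download apache-zookeeper-3.8.2-bin from the official site
Version 3.9 does not match ClickHouse and causes problems.
2.2 Upload apache-zookeeper-3.8.2-bin
Upload it to the /opt directory on all three servers.
2.3 Configure ZooKeeper (zoo.cfg)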
cd /opt/apache-zookeeper-3.8.2-bin/conf/
cp zoo_sample.cfg zoo.cfg
The configuration is identical on all three servers:
more zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper/data
clientPort=2181
server.0=192.168.122.100:2888:3888
server.1=192.168.122.101:2888:3888
server.2=192.168.122.102:2888:3888
maxSessionTimeout=120000
forceSync=no
autopurge.snapRetainCount=20
autopurge.purgeInterval=48
4lw.commands.whitelist=*
2.4 Create the data directory on all three servers
mkdir -p /data/zookeeper/data
2.5 Write a different myid file on each server
2.5.1 On BCEuler01 (192.168.122.100)
echo 0 > /data/zookeeper/data/myid
2.5.2 On BCEuler02 (192.168.122.101)
echo 1 > /data/zookeeper/data/myid
2.5.3 On BCEuler03 (192.168.122.102)
echo 2 > /data/zookeeper/data/myid
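The myid value on each server must equal the N of that server's server.N line in zoo.cfg. As a sketch of deriving it instead of typing it, the hypothetical helper myid_for looks the id up by address:

```shell
# Look up the ZooKeeper server id for an address in zoo.cfg.
# Usage: myid_for <zoo.cfg> <address>
# Prints the N of the matching "server.N=<address>:..." line.
# Returns non-zero when the address is not listed.
myid_for() {
    local cfg=$1 addr=$2
    awk -F'[.=:]' -v addr="$addr" '
        /^server\.[0-9]+=/ {
            # Host part: everything between "=" and the first ":".
            line = $0
            sub(/^server\.[0-9]+=/, "", line)
            sub(/:.*/, "", line)
            if (line == addr) { print $2; found = 1 }
        }
        END { exit found ? 0 : 1 }
    ' "$cfg"
}
```

For example, on BCEuler02: `id=$(myid_for /opt/apache-zookeeper-3.8.2-bin/conf/zoo.cfg 192.168.122.101) && echo "$id" > /data/zookeeper/data/myid`.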
2.6 Start ZooKeeper (on all three servers)
cd /opt/apache-zookeeper-3.8.2-bin
nohup bin/zkServer.sh start &
2.7 Check the ZooKeeper status
2.7.1 Method 1
bin/zkServer.sh status
2.7.2 Method 2
(This method requires 4lw.commands.whitelist=* in zoo.cfg. The four-letter command is stat.)
echo stat | nc 192.168.122.100 2181
echo stat | nc 192.168.122.101 2181
echo stat | nc 192.168.122.102 2181
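A healthy three-node ensemble reports one leader and two followers. The sketch below prints the role of each server. The function names zk_mode and zk_cluster_modes are hypothetical. It assumes nc is installed and that the srvr four-letter command is whitelisted.

```shell
# Extract the "Mode:" value (leader, follower or standalone) from the text
# returned by the stat/srvr four-letter commands, read on stdin.
zk_mode() {
    awk -F': ' '/^Mode: / { print $2 }'
}

# Print "<address> <mode>" for every address given as an argument,
# querying client port 2181 on each one.
zk_cluster_modes() {
    local host
    for host in "$@"; do
        printf '%s %s\n' "$host" "$(echo srvr | nc "$host" 2181 | zk_mode)"
    done
}
```

Usage: `zk_cluster_modes 192.168.122.100 192.168.122.101 192.168.122.102`.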
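3 Set up the ClickHouse cluster
Cluster plan
The cluster has 2 shards, and each shard has 2 replicas.
Each replica is a separate clickhouse-server process, so each node runs 2 clickhouse-server processes, one for each shard.
The 2 processes on a node use different ports and different configuration files.
Nodes:
BCEuler01
First process: config.xml, tcp_port 9001
Second process: config2.xml, tcp_port 9002
BCEuler02
First process: config.xml, tcp_port 9001
Second process: config2.xml, tcp_port 9002
3.1 Download the packages from the official site and install them with rpm (on both nodes)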
clickhouse-common-static-23.7.4.5.x86_64.rpm
clickhouse-client-23.7.4.5.x86_64.rpm
clickhouse-server-23.7.4.5.x86_64.rpm
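3.2 Configure ClickHouse
cd /etc/clickhouse-server/
3.2.1 Set the default user's password (on both nodes)
Edit users.d/default-password.xml and set the password.
3.2.2 Configure config.xml and config2.xml
Each node runs 2 clickhouse-server processes, so it needs two configuration files (on both nodes):
cp config.xml config2.xml
Four configuration files are listed below. They are largely identical: the ports and the macros section differ between them, and the other settings shown are the ones that matter for the deployment.
3.2.2.1 Node 1 (BCEuler01 config.xml)
(Only the modified parts are listed.)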
<clickhouse>
<logger>
<level>test</level>
<log>/var/log/clickhouse-server/clickhouse-server.log</log>
<errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
</logger>
<http_port>8123</http_port>
<tcp_port>9001</tcp_port>
<mysql_port>9004</mysql_port>
<postgresql_port>9005</postgresql_port>
<interserver_http_port>9009</interserver_http_port>
<!-- Path to data directory, with trailing slash. -->
<path>/data/clickhouse/</path>
<!-- Path to temporary data for processing hard queries. -->
<tmp_path>/data/clickhouse/tmp/</tmp_path>
<!-- Directory with user provided files that are accessible by 'file' table function. -->
<user_files_path>/data/clickhouse/user_files/</user_files_path>
<!-- Path to folder where users created by SQL commands are stored. -->
<access_control_path>/data/clickhouse/access/</access_control_path>
<remote_servers>
<!-- Test only shard config for testing distributed storage -->
<ck_2shard_2replica_cluster>
<shard>
<internal_replication>true</internal_replication>
<weight>1</weight>
<replica>
<host>BCEuler01</host>
<port>9001</port>
</replica>
<replica>
<host>BCEuler02</host>
<port>9002</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<weight>1</weight>
<replica>
<host>BCEuler02</host>
<port>9001</port>
</replica>
<replica>
<host>BCEuler01</host>
<port>9002</port>
</replica>
</shard>
</ck_2shard_2replica_cluster>
</remote_servers>
<zookeeper>
<node>
<host>192.168.122.100</host>
<port>2181</port>
</node>
<node>
<host>192.168.122.101</host>
<port>2181</port>
</node>
<node>
<host>192.168.122.102</host>
<port>2181</port>
</node>
</zookeeper>
<macros>
<replica>BCEuler0101</replica>
<shard>01</shard>
</macros>
<!-- Directory in <clickhouse-path> containing schema files for various input formats.
The directory will be created if it doesn't exist.
-->
<format_schema_path>/data/clickhouse/format_schemas/</format_schema_path>
</clickhouse>
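Explanation:
3.2.2.1.1 Port configuration
config.xml and config2.xml on the same node must use different ports. A port can be bound by only one process, so config2.xml changes every port. tcp_port is the port referenced in the cluster configuration.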
<http_port>8123</http_port>
<tcp_port>9001</tcp_port>
<mysql_port>9004</mysql_port>
<postgresql_port>9005</postgresql_port>
<interserver_http_port>9009</interserver_http_port>
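Because a clash between the two files only shows up when the second process fails to start, a quick scan before starting can help. The sketch below is a plain text scan, not an XML parser. It assumes each *_port element sits on its own line, and it will also count elements inside XML comments. The function name duplicate_ports is hypothetical.

```shell
# Print every port number that appears in more than one *_port element
# across the given config files (one number per line, empty if none clash).
# Bare <port> elements (cluster and ZooKeeper sections) are ignored.
duplicate_ports() {
    cat "$@" \
        | sed -n 's/.*<[a-z_]*_port>\([0-9][0-9]*\)<\/[a-z_]*_port>.*/\1/p' \
        | sort -n | uniq -d
}
```

Usage: `duplicate_ports /etc/clickhouse-server/config.xml /etc/clickhouse-server/config2.xml`. Empty output means no two *_port elements share a value.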
3.2.2.1.2 Cluster shard configuration
<remote_servers>
<!-- Test only shard config for testing distributed storage -->
<ck_2shard_2replica_cluster>
<shard>
<internal_replication>true</internal_replication>
<weight>1</weight>
<replica>
<host>BCEuler01</host>
<port>9001</port>
</replica>
<replica>
<host>BCEuler02</host>
<port>9002</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<weight>1</weight>
<replica>
<host>BCEuler02</host>
<port>9001</port>
</replica>
<replica>
<host>BCEuler01</host>
<port>9002</port>
</replica>
</shard>
</ck_2shard_2replica_cluster>
</remote_servers>
Each block of the form
<shard>
.....
</shard>
defines one shard. The cluster has 2 shards, so the structure appears twice.
First shard:
<shard>
<internal_replication>true</internal_replication>
<weight>1</weight>
<replica>
<host>BCEuler01</host>
<port>9001</port>
</replica>
<replica>
<host>BCEuler02</host>
<port>9002</port>
</replica>
</shard>
Its replicas are the 1st clickhouse-server process on BCEuler01 and the 2nd clickhouse-server process on BCEuler02.
Second shard:
<shard>
<internal_replication>true</internal_replication>
<weight>1</weight>
<replica>
<host>BCEuler02</host>
<port>9001</port>
</replica>
<replica>
<host>BCEuler01</host>
<port>9002</port>
</replica>
</shard>
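Its replicas are the 2nd clickhouse-server process on BCEuler01 and the 1st clickhouse-server process on BCEuler02.
Summary:
On BCEuler01:
the 1st clickhouse-server process belongs to shard 1
the 2nd clickhouse-server process belongs to shard 2
On BCEuler02:
the 1st clickhouse-server process belongs to shard 2
the 2nd clickhouse-server process belongs to shard 1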
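Once the processes are running, the placement summarized above can be compared with what the server itself reports in the system.clusters table. This is a sketch only. The function names and the CH_PASSWORD environment variable are hypothetical, and the default user's password is assumed to be stored in that variable.

```shell
# Build the query that lists shard/replica placement for one cluster.
cluster_layout_query() {
    printf "SELECT shard_num, replica_num, host_name, port FROM system.clusters WHERE cluster = '%s' ORDER BY shard_num, replica_num" "$1"
}

# Run the query against one clickhouse-server process.
# Usage: show_cluster_layout <host> <tcp_port>
# CH_PASSWORD (hypothetical) holds the default user's password.
show_cluster_layout() {
    clickhouse-client --host "$1" --port "$2" --user default \
        --password "$CH_PASSWORD" \
        --query "$(cluster_layout_query ck_2shard_2replica_cluster)"
}
```

Running `show_cluster_layout BCEuler01 9001` should list four rows, two per shard, matching the host and port pairs in the remote_servers section.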
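3.2.2.1.3 Cluster replica configuration
<replica>
</replica>
defines one replica. Each shard has 2 replicas, so the structure appears twice per shard.
The first shard has 2 replicas: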
<replica>
<host>BCEuler01</host>
<port>9001</port>
</replica>
<replica>
<host>BCEuler02</host>
<port>9002</port>
</replica>
The second shard has 2 replicas:
<replica>
<host>BCEuler02</host>
<port>9001</port>
</replica>
<replica>
<host>BCEuler01</host>
<port>9002</port>
</replica>
3.2.2.1.4 ZooKeeper configuration
<zookeeper>
<node>
<host>192.168.122.100</host>
<port>2181</port>
</node>
<node>
<host>192.168.122.101</host>
<port>2181</port>
</node>
<node>
<host>192.168.122.102</host>
<port>2181</port>
</node>
</zookeeper>
3.2.2.1.5 macros configuration
<macros>
<replica>BCEuler0101</replica>
<shard>01</shard>
</macros>
3.2.2.1.6 Other settings
<remote_servers>
<!-- Test only shard config for testing distributed storage -->
<ck_2shard_2replica_cluster>
.....
</ck_2shard_2replica_cluster>
</remote_servers>
ck_2shard_2replica_cluster is the cluster name. When creating databases and tables with CREATE statements, add ON CLUSTER ck_2shard_2replica_cluster.
<shard>
<internal_replication>true</internal_replication>
<weight>1</weight>
</shard>
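internal_replication: when true, a write through a Distributed table goes to one replica of the shard, and the replicated table engine copies it to the other replica.
<weight>1</weight>: the weight used to distribute data across shards.
3.2.2.2 Node 1 (BCEuler01 config2.xml)
Only the differing parts are listed: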
<http_port>8124</http_port>
<tcp_port>9002</tcp_port>
<mysql_port>9014</mysql_port>
<postgresql_port>9015</postgresql_port>
<interserver_http_port>9019</interserver_http_port>
<macros>
<replica>BCEuler0102</replica>
<shard>02</shard>
</macros>
3.2.2.3 Node 2 (BCEuler02 config.xml)
<http_port>8123</http_port>
<tcp_port>9001</tcp_port>
<mysql_port>9004</mysql_port>
<postgresql_port>9005</postgresql_port>
<interserver_http_port>9009</interserver_http_port>
<macros>
<replica>BCEuler0201</replica>
<shard>02</shard>
</macros>
3.2.2.4 Node 2 (BCEuler02 config2.xml)
<http_port>8124</http_port>
<tcp_port>9002</tcp_port>
<mysql_port>9014</mysql_port>
<postgresql_port>9015</postgresql_port>
<interserver_http_port>9019</interserver_http_port>
<macros>
<replica>BCEuler0202</replica>
<shard>01</shard>
</macros>
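3.2.2.5 Notes
The shard and replica values in macros follow this logic:
shard: 01 for the first shard, 02 for the second shard.
replica: the hostname followed by 01 for the first replica, and the hostname followed by 02 for the second replica.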
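The naming rule can be generated rather than typed by hand, which avoids copy-and-paste slips between the four files. The helper name macros_xml is hypothetical, and it expects the shard and replica numbers as plain integers such as 1 or 2.

```shell
# Print a <macros> block that follows the naming rule:
# replica = hostname + two-digit replica number, shard = two-digit shard number.
# Usage: macros_xml <hostname> <shard number> <replica number>
macros_xml() {
    printf '<macros>\n<replica>%s%02d</replica>\n<shard>%02d</shard>\n</macros>\n' \
        "$1" "$3" "$2"
}
```

For example, `macros_xml BCEuler01 1 1` reproduces the macros block of config.xml on BCEuler01, and `macros_xml BCEuler01 2 2` reproduces the one in its config2.xml.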
3.3 Start the processes (same on both nodes)
First process (config.xml):
nohup clickhouse-server --config-file /etc/clickhouse-server/config.xml &
Second process (config2.xml):
nohup clickhouse-server --config-file /etc/clickhouse-server/config2.xml &
4 View the ClickHouse log
tail -f /var/log/clickhouse-server/clickhouse-server.log
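5 ODBC configuration for PostgreSQL
5.1 Install unixODBC
yum install unixODBC-devel unixODBC
5.2 Install psqlodbc-15.00.0000
Download the source package from the PostgreSQL official site.
psqlodbc is built from source mainly to obtain psqlodbcw.so and libodbcpsqlS.so.
After building and installing, copy the 2 .so files to /usr/lib64: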
cp /usr/local/lib/psqlodbcw.so /usr/lib64/
cp /usr/local/lib/libodbcpsqlS.so /usr/lib64/
5.3 Configure odbc.ini and odbcinst.ini
5.3.1 Configure odbcinst.ini
more /etc/odbcinst.ini
# Example driver definitions
# Driver from the postgresql-odbc package
# Setup from the unixODBC package
[PostgreSQL]
Description = ODBC for PostgreSQL
Driver = /usr/lib/psqlodbcw.so
Setup = /usr/lib/libodbcpsqlS.so
Driver64 = /usr/lib64/psqlodbcw.so
Setup64 = /usr/lib64/libodbcpsqlS.so
FileUsage = 1
# Driver from the mysql-connector-odbc package
# Setup from the unixODBC package
[MySQL]
Description = ODBC for MySQL
Driver = /usr/lib/libmyodbc5.so
Setup = /usr/lib/libodbcmyS.so
Driver64 = /usr/lib64/libmyodbc5.so
Setup64 = /usr/lib64/libodbcmyS.so
FileUsage = 1
5.3.2 Configure odbc.ini
more /etc/odbc.ini
Note: [postgresql] is the data source name (DSN) used to connect, similar to an Oracle database link name.
[DEFAULT]
Driver = postgresql
[postgresql]
Description = GP
Driver = PostgreSQL
Database = gpdbtest
Servername = 192.168.122.102
UserName = gpadmin
Password = 1233455
Port = 5432
ReadOnly = 0
ConnSettings = set client_encoding to UTF8
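The DSN section refers to a driver by name, and that name has to exist as a section in odbcinst.ini. The sketch below reads a single key from an ini section so the two files can be cross-checked before connecting. The function name ini_get is hypothetical, and the parsing is deliberately simple: exact section-name match, one key = value per line.

```shell
# Print the value of a key inside one [section] of an ini file.
# Usage: ini_get <file> <section> <key>
# Prints nothing when the section or key is missing.
ini_get() {
    awk -v section="[$2]" -v key="$3" '
        /^\[/ { in_section = ($0 == section); next }
        in_section {
            split($0, kv, "=")
            k = kv[1]; gsub(/^[ \t]+|[ \t]+$/, "", k)
            if (k == key) {
                # Keep everything after the first "=" so values may contain "=".
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                print v
            }
        }
    ' "$1"
}
```

Usage: `driver=$(ini_get /etc/odbc.ini postgresql Driver)` followed by `ini_get /etc/odbcinst.ini "$driver" Driver64` should print the path of the driver library if both files line up.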
5.3.3 Verify the ODBC setup
Check that the remote Greenplum database can be reached:
isql postgresql gpadmin 1233455
The output should contain the word Connected.
6 Connect to the GP database
6.1 Create a dictionary that uses the DSN postgresql
CREATE DICTIONARY test ON CLUSTER ck_2shard_2replica_cluster (
`id` Int64 DEFAULT 0,
`name` String DEFAULT 'a'
)
PRIMARY KEY id
SOURCE(ODBC(CONNECTION_STRING 'DSN=postgresql' TABLE 'test_data'))
LIFETIME(MIN 180 MAX 240)
LAYOUT(HASHED())
postgresql is the DSN (database link name) configured in /etc/odbc.ini.
6.2 Query Greenplum data from ClickHouse
clickhouse-client --port=9001 --user=default
Enter the default user's password.
BCEuler01 :) select * from test limit 10;
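If rows are returned, the connection works.
6.3 ODBC notes
6.3.1 Passwords containing special characters (%, _, @ and so on) for the Greenplum gpadmin user
After such a password is set in /etc/odbc.ini, the connection reports a wrong password. How to configure a password with special characters is still unresolved.
6.3.2 Adding logging and a log level to the ODBC configuration helps to diagnose specific problems
(To be continued)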