clickhouse集群搭建

Clickhouse集群搭建

文章目录

安装包下载

可以通过deb/rpm、tgz方式安装,但是离线安装选择tgz方式比较方便。本篇介绍

如何离线安装。

tgz下载:https://packages.clickhouse.com/tgz/lts

clickhouse单机安装

搭建集群需要在每台主机上进行单机安装,将tgz和install.sh上传到服务器,执行:

bash 复制代码
bash install.sh
bash 复制代码
#!/bin/bash

export LATEST_VERSION=22.3.19.6
export ARCH=arm64

tar -xzvf "clickhouse-common-static-$LATEST_VERSION-${ARCH}.tgz" \
  || tar -xzvf "clickhouse-common-static-$LATEST_VERSION.tgz"
sudo "clickhouse-common-static-$LATEST_VERSION/install/doinst.sh"

tar -xzvf "clickhouse-common-static-dbg-$LATEST_VERSION-${ARCH}.tgz" \
  || tar -xzvf "clickhouse-common-static-dbg-$LATEST_VERSION.tgz"
sudo "clickhouse-common-static-dbg-$LATEST_VERSION/install/doinst.sh"

tar -xzvf "clickhouse-server-$LATEST_VERSION-${ARCH}.tgz" \
  || tar -xzvf "clickhouse-server-$LATEST_VERSION.tgz"
sudo "clickhouse-server-$LATEST_VERSION/install/doinst.sh" configure
sudo /etc/init.d/clickhouse-server start

tar -xzvf "clickhouse-client-$LATEST_VERSION-${ARCH}.tgz" \
  || tar -xzvf "clickhouse-client-$LATEST_VERSION.tgz"
sudo "clickhouse-client-$LATEST_VERSION/install/doinst.sh"

默认安装

默认数据库目录
  • 默认数据目录路径: 默认数据目录处于/var/lib/clickhouse
  • 数据目录初始化:删除此目录,重新启动clickhouse systemctl restart clickhouse-server可以重新将数据目录初始化。
更改默认数据目录

要更改默认数据目录更改,修改/etc/clickhouse-server/config.xml

xml 复制代码
    <!-- Path to data directory, with trailing slash. -->
    <path>/var/lib/clickhouse/</path>

配置文件是只读的,所以要修改数据目录,

  1. 需要修改其权限
bash 复制代码
chmod 755 /etc/clickhouse-server/config.xml
  1. 将全部/var/lib/clickhouse/修改为其他数据目录
  2. 删除默认数据库目录下所有文件。
  3. 重启clickhouse-server,对数据目录进行重新初始化。

2分片-1副本-3节点集群搭建

clickhouse的集群每个分片的每个副本只能放到单独的实例上,比如2分片-2副本需要4台机器,3分片-2副本需要6台机器。

集群的拓扑:

节点 角色 描述
chnode1 Data + ClickHouse Keeper 数据节点 + 协调器
chnode2 Data + ClickHouse Keeper 数据节点 + 协调器
chnode3 ClickHouse Keeper 协调器

分片及副本分布:

节点 角色
chnode1 分片1、副本1
chnode2 分片2、副本1
chnode3 无分片、无副本

1. 配置hosts

对于每个主机,将如下内容追加到/etc/hosts

bash 复制代码
10.55.134.82 chnode1     
10.55.134.93 chnode2
10.55.134.99 chnode3

2. 修改每个主机的主机名

bash 复制代码
hostnamectl set-hostname chnode1

3. 配置文件上传

配置文件分布

/etc/clickhouse-server/config.d/目录中的配置会覆盖默认配置,所以官网建议:

  • 服务器配置添加到/etc/clickhouse-server/config.d/
  • 用户配置添加到/etc/clickhouse-server/users.d/
  • 不要更改/etc/clickhouse-server/config.xml
  • 不要更改/etc/clickhouse-server/users.xml

我们先来看看每个主机上的配置文件:

  • chnode1
bash 复制代码
[root@chnode1 clickhouse]# ll /etc/clickhouse-server/config.d/
total 24
-rw-r--r-- 1 root       root       985 Feb 13 15:41 enable-keeper.xml
-rw-r--r-- 1 clickhouse clickhouse  66 Feb 13 16:19 listen.xml
-rw-r--r-- 1 root       root       104 Feb 13 15:41 macros.xml
-rw-r--r-- 1 root       root       574 Feb 13 16:52 network-and-logging.xml
-rw-r--r-- 1 root       root       599 Feb 13 15:41 remote-servers.xml
-rw-r--r-- 1 root       root       386 Feb 13 15:41 use-keeper.xml
  • chnode2
bash 复制代码
[root@chnode2 clickhouse]# ll /etc/clickhouse-server/config.d/
total 24
-rw-r--r-- 1 root       root       985 Feb 13 15:41 enable-keeper.xml
-rw-r--r-- 1 clickhouse clickhouse  66 Feb 13 16:19 listen.xml
-rw-r--r-- 1 root       root       104 Feb 13 15:41 macros.xml
-rw-r--r-- 1 root       root       574 Feb 13 16:52 network-and-logging.xml
-rw-r--r-- 1 root       root       599 Feb 13 15:41 remote-servers.xml
-rw-r--r-- 1 root       root       386 Feb 13 15:41 use-keeper.xml
  • chnode3
bash 复制代码
[root@chnode3 clickhouse]# ll /etc/clickhouse-server/config.d/
total 12
-rw-r--r-- 1 clickhouse clickhouse 985 Feb 13 15:45 enable-keeper.xml
-rw-r--r-- 1 clickhouse clickhouse  61 Feb 13 16:06 listen.xml
-rw-r--r-- 1 clickhouse clickhouse 566 Feb 13 15:45 network-and-logging.xml
chnode1配置文件
  • network-and-logging.xml

    • 日志在1000M大小时滚动一次,保留3000M的日志。
    • clickhouse监听8123和9000端口
    • 服务器间通信使用端口9009
xml 复制代码
<clickhouse>
        <logger>
                <level>debug</level>
                <log>/var/log/clickhouse-server/clickhouse-server.log</log>
                <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
                <size>1000M</size>
                <count>3</count>
        </logger>
        <display_name>clickhouse</display_name>
        <listen_host>0.0.0.0</listen_host>
        <http_port>8123</http_port>
        <tcp_port>9000</tcp_port>
        <interserver_http_port>9009</interserver_http_port>
</clickhouse>
  • enable-keeper.xml
    • chnode节点的server_id设置为1,其他节点id要不同。
    • 其他配置和chnode2一样
xml 复制代码
<clickhouse>
  <keeper_server>
    <tcp_port>9181</tcp_port>
    <server_id>1</server_id>
    <log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>
    <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>

    <coordination_settings>
        <operation_timeout_ms>10000</operation_timeout_ms>
        <session_timeout_ms>30000</session_timeout_ms>
        <raft_logs_level>trace</raft_logs_level>
    </coordination_settings>

    <raft_configuration>
        <server>
            <id>1</id>
            <hostname>chnode1</hostname>
            <port>9234</port>
        </server>
        <server>
            <id>2</id>
            <hostname>chnode2</hostname>
            <port>9234</port>
        </server>
        <server>
            <id>3</id>
            <hostname>chnode3</hostname>
            <port>9234</port>
        </server>
    </raft_configuration>
  </keeper_server>
</clickhouse>
  • macros.xml
    • shard值为1,指定了本节点存储分片1,副本1,chnode2里shard的值将变为2
    • 这种指定方式可以减少DDL语句复杂度,不用在建表时候再去指定分片分配到哪个节点。
xml 复制代码
<clickhouse>
  <macros>
    <shard>1</shard>
    <replica>replica_1</replica>
  </macros>
</clickhouse>
  • remote-servers.xml
    • remote-servers部分指定了所有集群,replace="true"表示覆盖默认配置里配置的集群。
    • 指定了一个集群名为cluster_2S_1R
    • 集群使用secret进行加密通信
    • cluster_2S_1R集群有两个分片,每个分片有一个副本。
    • internal_replication设置为true表示写入操作时会选择第一个发现的健康副本去写入。

chnode1和chnode2的remote-servers.xml配置相同。

xml 复制代码
<clickhouse>
  <remote_servers replace="true">
    <cluster_2S_1R>
    <secret>mysecretphrase</secret>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>chnode1</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>chnode2</host>
                <port>9000</port>
            </replica>
        </shard>
    </cluster_2S_1R>
  </remote_servers>
</clickhouse>
  • use-keeper.xml
    • 指定3个节点的zookeeper端口为9181
xml 复制代码
<clickhouse>
    <zookeeper>
        <node index="1">
            <host>chnode1</host>
            <port>9181</port>
        </node>
        <node index="2">
            <host>chnode2</host>
            <port>9181</port>
        </node>
        <node index="3">
            <host>chnode3</host>
            <port>9181</port>
        </node>
    </zookeeper>
</clickhouse>
chnode2配置文件
  • network-and-logging.xml

和chnode1配置相同

xml 复制代码
<clickhouse>
        <logger>
                <level>debug</level>
                <log>/var/log/clickhouse-server/clickhouse-server.log</log>
                <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
                <size>1000M</size>
                <count>3</count>
        </logger>
        <display_name>clickhouse</display_name>
        <listen_host>0.0.0.0</listen_host>
        <http_port>8123</http_port>
        <tcp_port>9000</tcp_port>
        <interserver_http_port>9009</interserver_http_port>
</clickhouse>
  • enable-keeper.xml
    • chnode2节点的server_id设置为2,其他相同
xml 复制代码
<clickhouse>
  <keeper_server>
    <tcp_port>9181</tcp_port>
    <server_id>2</server_id>
    <log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>
    <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>

    <coordination_settings>
        <operation_timeout_ms>10000</operation_timeout_ms>
        <session_timeout_ms>30000</session_timeout_ms>
        <raft_logs_level>trace</raft_logs_level>
    </coordination_settings>

    <raft_configuration>
        <server>
            <id>1</id>
            <hostname>chnode1</hostname>
            <port>9234</port>
        </server>
        <server>
            <id>2</id>
            <hostname>chnode2</hostname>
            <port>9234</port>
        </server>
        <server>
            <id>3</id>
            <hostname>chnode3</hostname>
            <port>9234</port>
        </server>
    </raft_configuration>
  </keeper_server>
</clickhouse>
  • macros.xml
    • shard值为2,指定了本节点存储分片2,副本1
xml 复制代码
<clickhouse>
  <macros>
    <shard>2</shard>
    <replica>replica_1</replica>
  </macros>
</clickhouse>
  • remote-servers.xml

chnode1和chnode2的remote-servers.xml配置相同。

xml 复制代码
<clickhouse>
  <remote_servers replace="true">
    <cluster_2S_1R>
    <secret>mysecretphrase</secret>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>chnode1</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>chnode2</host>
                <port>9000</port>
            </replica>
        </shard>
    </cluster_2S_1R>
  </remote_servers>
</clickhouse>
  • use-keeper.xml

和chnode1配置相同

xml 复制代码
<clickhouse>
    <zookeeper>
        <node index="1">
            <host>chnode1</host>
            <port>9181</port>
        </node>
        <node index="2">
            <host>chnode2</host>
            <port>9181</port>
        </node>
        <node index="3">
            <host>chnode3</host>
            <port>9181</port>
        </node>
    </zookeeper>
</clickhouse>
chnode3配置文件
  • network-and-logging.xml

和chnode1配置相同

xml 复制代码
<clickhouse>
        <logger>
                <level>debug</level>
                <log>/var/log/clickhouse-server/clickhouse-server.log</log>
                <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
                <size>1000M</size>
                <count>3</count>
        </logger>
        <display_name>clickhouse</display_name>
        <listen_host>0.0.0.0</listen_host>
        <http_port>8123</http_port>
        <tcp_port>9000</tcp_port>
        <interserver_http_port>9009</interserver_http_port>
</clickhouse>
  • enable-keeper.xml
    • chnode3节点的server_id设置为3,其他相同
xml 复制代码
<clickhouse>
  <keeper_server>
    <tcp_port>9181</tcp_port>
    <server_id>3</server_id>
    <log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>
    <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>

    <coordination_settings>
        <operation_timeout_ms>10000</operation_timeout_ms>
        <session_timeout_ms>30000</session_timeout_ms>
        <raft_logs_level>trace</raft_logs_level>
    </coordination_settings>

    <raft_configuration>
        <server>
            <id>1</id>
            <hostname>chnode1</hostname>
            <port>9234</port>
        </server>
        <server>
            <id>2</id>
            <hostname>chnode2</hostname>
            <port>9234</port>
        </server>
        <server>
            <id>3</id>
            <hostname>chnode3</hostname>
            <port>9234</port>
        </server>
    </raft_configuration>
  </keeper_server>
</clickhouse>

4. 重启clickhouse-server

三台服务器全部执行:

bash 复制代码
systemctl restart clickhouse-server

5. 集群搭建测试

下面来创建样例分布表,测试下效果:

  1. 连接chnode1,执行SHOW CLUSTERS
bash 复制代码
[root@chnode1 clickhouse]# clickhouse-client --password -h 127.0.0.1
ClickHouse client version 22.3.19.6 (official build).
Password for user (default): 
Connecting to 127.0.0.1:9000 as user default.
Connected to ClickHouse server version 22.3.19 revision 54455.

clickhouse :) SHOW CLUSTERS

SHOW CLUSTERS

Query id: 0ff16c63-3c1e-438d-8fad-1e8e25c42235

┌─cluster───────┐
│ cluster_2S_1R │
└───────────────┘
  1. 创建数据库
sql 复制代码
CREATE DATABASE db1 ON CLUSTER cluster_2S_1R
  1. 创建一个分布表
sql 复制代码
CREATE TABLE db1.table1_dist ON CLUSTER cluster_2S_1R
(
    `id` UInt64,
    `column1` String
)
ENGINE = Distributed('cluster_2S_1R', 'db1', 'table1', rand())
  1. 分别clickhouse-client连接chnode1和chnode2.

chnode1插入:

sql 复制代码
INSERT INTO db1.table1 (id, column1) VALUES (1, 'abc');

chnode2插入:

sql 复制代码
INSERT INTO db1.table1 (id, column1) VALUES (2, 'def');
  1. 查询数据
sql 复制代码
clickhouse :) SELECT * FROM db1.table1_dist;

SELECT *
FROM db1.table1_dist

Query id: 8ce26016-f923-472e-894d-a7a3025a8927

┌─id─┬─column1─┐
│  1 │ abc     │
└────┴─────────┘
┌─id─┬─column1─┐
│  1 │ abc     │
└────┴─────────┘

Reference List

  1. https://clickhouse.com/docs/en/architecture/horizontal-scaling
相关推荐
余衫马4 小时前
CentOS7 离线安装 Postgresql 指南
数据库·postgresql
E___V___E4 小时前
MySQL数据库入门到大蛇尚硅谷宋红康老师笔记 高级篇 part 2
数据库·笔记·mysql
m0_748254885 小时前
mysql之如何获知版本
数据库·mysql
mikey棒棒棒5 小时前
Redis——优惠券秒杀问题(分布式id、一人多单超卖、乐悲锁、CAS、分布式锁、Redisson)
数据库·redis·lua·redisson·watchdog·cas·并发锁
水手胡巴6 小时前
oracle apex post接口
数据库·oracle
史迪仔01129 小时前
【SQL】SQL多表查询
数据库·sql
Quz9 小时前
MySQL:修改数据库默认存储目录与数据迁移
数据库·mysql
Familyism9 小时前
Redis
数据库·redis·缓存
隔壁老登9 小时前
查询hive指定数据库下所有表的建表语句并生成数据字典
数据库·hive·hadoop
sekaii10 小时前
ReDistribution plan细节
linux·服务器·数据库