Author: Junda

Introduction

ClickHouse is an open-source analytical database with strong performance. This article describes how to deploy and use ClickHouse in a Kubernetes (K8s) environment.
We use the open-source clickhouse-operator: https://github.com/Altinity/clickhouse-operator
Prerequisites:

- Kubernetes 1.15+. We used Kubernetes 1.20 with three 8C16G nodes.
- A storage CSI driver. We used NFS, but only for testing; NFS is not recommended as database storage in production.
Installation and Deployment

1. Download the operator source code

```bash
wget https://github.com/Altinity/clickhouse-operator/archive/refs/tags/0.18.4.tar.gz
```

After extracting the archive, the main directories are:

- deploy: manifests and scripts for installing the operator and ZooKeeper
- docs: sample configurations for various ClickHouse clusters
- cmd, pkg: the operator source code, which we will not look at here
2. Install the operator

Installation script:

```bash
deploy/operator/clickhouse-operator-install.sh
```

By default, the operator is installed into the kube-system namespace.

```bash
# kubectl get po -n kube-system | grep click
clickhouse-operator-994c5bb44-g9t9s   2/2   Running   2     24h
```
The script also creates the related CRDs:

```bash
# kubectl get crd | grep click
clickhouseinstallations.clickhouse.altinity.com            2022-04-19T07:43:25Z
clickhouseinstallationtemplates.clickhouse.altinity.com    2022-04-19T07:43:25Z
clickhouseoperatorconfigurations.clickhouse.altinity.com   2022-04-19T07:43:25Z
```
CRDs:

- clickhouseinstallations: describes a ClickHouse installation, i.e. a ClickHouse cluster deployment.
- clickhouseinstallationtemplates: reusable templates that ClickHouseInstallation resources can reference.
- clickhouseoperatorconfigurations: configuration settings for the operator itself.
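Once the CRDs are registered, their schemas can be browsed directly from the cluster with `kubectl explain` (a quick sketch; requires access to a cluster where the operator is installed):

```shell
# Inspect the top-level schema of the ClickHouseInstallation CRD
kubectl explain clickhouseinstallations.clickhouse.altinity.com

# Drill down into the cluster layout fields (shardsCount, replicasCount)
kubectl explain clickhouseinstallations.spec.configuration.clusters
```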
3. Install ZooKeeper

ClickHouse's distributed DDL (ON CLUSTER) and table replication depend on ZooKeeper. You can use an external ZooKeeper ensemble, or run ZooKeeper inside the Kubernetes cluster.
Here we used the following script to create a 3-node ZooKeeper ensemble:

```bash
deploy/zookeeper/quick-start-persistent-volume/zookeeper-3-nodes-create.sh
```
```bash
# kubectl get po -n zoo3ns
NAME          READY   STATUS    RESTARTS   AGE
zookeeper-0   1/1     Running   2          23h
zookeeper-1   1/1     Running   1          23h
zookeeper-2   1/1     Running   0          23h

# kubectl get svc -n zoo3ns
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
zookeeper    ClusterIP   10.104.149.244   <none>        2181/TCP,7000/TCP   23h
zookeepers   ClusterIP   None             <none>        2888/TCP,3888/TCP   23h
```
After the script completes successfully, the ZooKeeper pods and services are visible.
The `zookeeper` service shown here can be used as the ZooKeeper endpoint in the ClickHouse configuration.
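Before pointing ClickHouse at ZooKeeper, it can be worth checking that the ensemble answers. A quick sketch using ZooKeeper's `ruok` four-letter-word command; this assumes `nc` is available in the ZooKeeper image and that `ruok` is whitelisted on the servers:

```shell
# Ask each ZooKeeper node whether it is healthy; a healthy node replies "imok"
for i in 0 1 2; do
  kubectl exec -n zoo3ns "zookeeper-$i" -- bash -c 'echo ruok | nc 127.0.0.1 2181'
  echo
done
```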
4. Install the ClickHouse cluster

The docs/chi-examples directory provides many sample YAML files for ClickHouse cluster configurations.
Here we create a ClickHouse environment with multiple shards, replicas, and clusters, and use it to introduce ClickHouse's core concepts and configuration options.
ClickHouse YAML file:
```yaml
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "ck-cluster-x"
spec:
  configuration:
    users:
      user2/networks/ip: "::/0"
      user2/password: qwerty
      user2/profile: default
    zookeeper:
      nodes:
        - host: 10.104.149.244
    clusters:
      - name: "ck-cluster-1"
        layout:
          shardsCount: 2
          replicasCount: 2
      - name: "ck-cluster-2"
        layout:
          shardsCount: 2
          replicasCount: 2
  defaults:
    templates:
      # Templates are specified as default for all clusters
      podTemplate: pod-template-resource-limit
      hostTemplate: host-template-custom-ports
  templates:
    hostTemplates:
      - name: host-template-custom-ports
        spec:
          tcpPort: 7000
          httpPort: 7001
          interserverHTTPPort: 7002
    podTemplates:
      - name: pod-template-resource-limit
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:22.3
              volumeMounts:
                - name: clickhouse-data-storage
                  mountPath: /var/lib/clickhouse
              # Container has explicitly specified resource limits
              resources:
                requests:
                  memory: "1024Mi"
                  cpu: "500m"
                limits:
                  memory: "1024Mi"
                  cpu: "500m"
    volumeClaimTemplates:
      - name: clickhouse-data-storage
        spec:
          accessModes:
            - ReadWriteOnce
          # VolumeClaim has explicitly specified resource limits
          resources:
            requests:
              storage: 500Mi
```
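Before creating the resource, the manifest can be validated against the CRD schema with a server-side dry run (a sketch; assumes the manifest is saved as 02.yaml, as in the apply step):

```shell
# Server-side dry run: the API server validates the manifest against
# the ClickHouseInstallation CRD schema without creating anything
kubectl apply --dry-run=server -f 02.yaml
```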
Key configuration items:

- configuration.zookeeper: ZooKeeper settings; multiple nodes can be specified. Here we use the service IP 10.104.149.244 with the default port.
- users: the list of user accounts.
- clusters: in ClickHouse a cluster is a logical concept, and multiple clusters can be configured.
  - name: cluster name.
  - layout: sets the number of shards and replicas for the cluster.
    - shardsCount: number of shards.
    - replicasCount: number of replicas.
- defaults: default settings.
  - templates: names the ClickHouse-related templates (Pod, PVC, and so on) to apply by default; the templates themselves are defined under templates.
    - podTemplate: sets the Pod image, resource limits, and other parameters.
    - hostTemplate: configures the ports ClickHouse listens on; the default ports are usually fine.
- templates: template definitions.
  - hostTemplates: host templates.
  - podTemplates: Pod templates; here memory is set somewhat higher (1Gi) to reduce the chance of OOM.
  - volumeClaimTemplates: persistent volume claim templates.
Create the ClickHouse installation:

```bash
kubectl apply -f 02.yaml
```

Once creation completes, you can see the following.

clickhouseinstallations: 2 clusters, 8 hosts, in Completed status.

```bash
# kubectl get clickhouseinstallations
NAME           CLUSTERS   HOSTS   STATUS      AGE
ck-cluster-x   2          8       Completed   86m
```
Pods: 8 pods in total were started.

```bash
# kubectl get pod
NAME                                  READY   STATUS    RESTARTS   AGE
chi-ck-cluster-x-ck-cluster-1-0-0-0   1/1     Running   0          81m
chi-ck-cluster-x-ck-cluster-1-0-1-0   1/1     Running   0          79m
chi-ck-cluster-x-ck-cluster-1-1-0-0   1/1     Running   0          78m
chi-ck-cluster-x-ck-cluster-1-1-1-0   1/1     Running   0          76m
chi-ck-cluster-x-ck-cluster-2-0-0-0   1/1     Running   0          75m
chi-ck-cluster-x-ck-cluster-2-0-1-0   1/1     Running   0          73m
chi-ck-cluster-x-ck-cluster-2-1-0-0   1/1     Running   0          71m
chi-ck-cluster-x-ck-cluster-2-1-1-0   1/1     Running   0          69m
```
StatefulSets: each StatefulSet corresponds to one pod.

```bash
# kubectl get sts
NAME                                READY   AGE
chi-ck-cluster-x-ck-cluster-1-0-0   1/1     89m
chi-ck-cluster-x-ck-cluster-1-0-1   1/1     88m
chi-ck-cluster-x-ck-cluster-1-1-0   1/1     88m
chi-ck-cluster-x-ck-cluster-1-1-1   1/1     88m
chi-ck-cluster-x-ck-cluster-2-0-0   1/1     87m
chi-ck-cluster-x-ck-cluster-2-0-1   1/1     87m
chi-ck-cluster-x-ck-cluster-2-1-0   1/1     87m
chi-ck-cluster-x-ck-cluster-2-1-1   1/1     86m
```
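The StatefulSet and pod names above follow the operator's naming convention: chi-&lt;CHI name&gt;-&lt;cluster name&gt;-&lt;shard index&gt;-&lt;replica index&gt;, with the pod name adding the StatefulSet ordinal -0 (each StatefulSet runs a single replica). A small sketch that reproduces the eight pod names for the 2-shard, 2-replica layout above; the function name chi_pod_names is illustrative, not part of the operator:

```shell
# Generate the operator's pod names for a CHI:
# chi-<chi>-<cluster>-<shard>-<replica>-<ordinal>
chi_pod_names() {
  local chi="$1"; shift
  local cluster shard replica
  for cluster in "$@"; do
    for shard in 0 1; do       # shardsCount: 2
      for replica in 0 1; do   # replicasCount: 2
        echo "chi-${chi}-${cluster}-${shard}-${replica}-0"
      done
    done
  done
}

chi_pod_names ck-cluster-x ck-cluster-1 ck-cluster-2
```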
Services: each pod has a corresponding headless service.
In addition, there is one LoadBalancer-type service.

```bash
# kubectl get svc
NAME                                TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
chi-ck-cluster-x-ck-cluster-1-0-0   ClusterIP      None             <none>        7001/TCP,7000/TCP,7002/TCP   90m
chi-ck-cluster-x-ck-cluster-1-0-1   ClusterIP      None             <none>        7001/TCP,7000/TCP,7002/TCP   90m
chi-ck-cluster-x-ck-cluster-1-1-0   ClusterIP      None             <none>        7001/TCP,7000/TCP,7002/TCP   90m
chi-ck-cluster-x-ck-cluster-1-1-1   ClusterIP      None             <none>        7001/TCP,7000/TCP,7002/TCP   89m
chi-ck-cluster-x-ck-cluster-2-0-0   ClusterIP      None             <none>        7001/TCP,7000/TCP,7002/TCP   89m
chi-ck-cluster-x-ck-cluster-2-0-1   ClusterIP      None             <none>        7001/TCP,7000/TCP,7002/TCP   89m
chi-ck-cluster-x-ck-cluster-2-1-0   ClusterIP      None             <none>        7001/TCP,7000/TCP,7002/TCP   88m
chi-ck-cluster-x-ck-cluster-2-1-1   ClusterIP      None             <none>        7001/TCP,7000/TCP,7002/TCP   88m
clickhouse-ck-cluster-x             LoadBalancer   10.100.145.161   <pending>
```
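With the services in place, the cluster can be exercised end to end. A sketch using the user2 account and custom tcpPort 7000 from the manifest above; the table name default.events and the ZooKeeper path are hypothetical, and this assumes user2 is permitted to run DDL. The ON CLUSTER statement exercises the ZooKeeper-backed distributed DDL:

```shell
# Connect to one replica and create a replicated table on every
# node of ck-cluster-1 via distributed DDL (requires ZooKeeper)
kubectl exec -it chi-ck-cluster-x-ck-cluster-1-0-0-0 -- \
  clickhouse-client --port 7000 --user user2 --password qwerty --query "
    CREATE TABLE default.events ON CLUSTER 'ck-cluster-1' (
      ts  DateTime,
      msg String
    ) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
    ORDER BY ts"
```

The {shard} and {replica} macros are populated per host by the operator, so the same statement yields a distinct replication path on each node.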
For more technical information, see the Yunche website: https://yunche.pro/?t=yrgw