A Detailed Guide to Deploying OpenStack on Kubernetes
Deploying OpenStack on Kubernetes is a complex but powerful approach, implemented primarily through the OpenStack-Helm project. It brings the benefits of high availability, elastic scalability, and containerization.
1. Overview and Architecture
1.1 OpenStack-Helm Architecture
```text
┌─────────────────────────────────────────────────────────────┐
│                     Kubernetes Cluster                      │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐        │
│  │   Control   │   │   Compute   │   │   Network   │        │
│  │    Plane    │   │    Plane    │   │    Plane    │        │
│  │             │   │             │   │             │        │
│  │ - Keystone  │   │ - Nova      │   │ - Neutron   │        │
│  │ - Glance    │   │ - Libvirt   │   │ - OVS       │        │
│  │ - Cinder    │   │             │   │             │        │
│  │ - Horizon   │   │             │   │             │        │
│  └─────────────┘   └─────────────┘   └─────────────┘        │
├─────────────────────────────────────────────────────────────┤
│                   Infrastructure Services                   │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐        │
│  │   MariaDB   │   │  RabbitMQ   │   │  Memcached  │        │
│  │   Galera    │   │   Cluster   │   │    Redis    │        │
│  └─────────────┘   └─────────────┘   └─────────────┘        │
└─────────────────────────────────────────────────────────────┘
```
2. Environment Preparation
2.1 Hardware Requirements
```yaml
# Minimum cluster configuration
minimum_requirements:
  control_nodes: 3
  compute_nodes: 2
  storage_nodes: 3
  per_node:
    cpu: 8 cores
    memory: 32GB
    storage: 100GB SSD
    network: 2x 1Gbps NICs

# Recommended production configuration
production_requirements:
  control_nodes: 3
  compute_nodes: 5+
  storage_nodes: 3+
  per_node:
    cpu: 16+ cores
    memory: 64GB+
    storage: 500GB+ NVMe
    network: 2x 10Gbps NICs
```
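Before going further, it helps to confirm that the nodes actually expose the expected capacity. A minimal check, assuming `kubectl` access to the cluster:

```bash
# Print each node's allocatable CPU and memory to compare against the
# requirements above.
kubectl get nodes -o custom-columns='NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory'
```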
2.2 Network Planning
```yaml
# Network segment plan
networks:
  management: 10.0.1.0/24   # Management network
  storage: 10.0.2.0/24      # Storage network
  tenant: 10.0.3.0/24       # Tenant network
  external: 192.168.1.0/24  # External network

services:
  kubernetes_service: 10.96.0.0/12
  kubernetes_pod: 10.244.0.0/16
```
3. Building the Kubernetes Cluster
3.1 Deploying the Cluster with kubeadm
```bash
#!/bin/bash
# setup_k8s_cluster.sh

# 1. Prepare every node
prepare_nodes() {
    # Disable swap (required by kubelet)
    swapoff -a
    sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab

    # Install Docker. Note: Kubernetes 1.24+ removed dockershim, so a
    # CRI runtime such as containerd (installed alongside Docker) or
    # cri-dockerd is required.
    curl -fsSL https://get.docker.com | sh
    systemctl enable docker
    systemctl start docker

    # Install kubeadm, kubelet, and kubectl from the community package
    # repository (the legacy packages.cloud.google.com repo has been
    # shut down; adjust the version path to the release you need).
    cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/repodata/repomd.xml.key
EOF
    yum install -y kubelet kubeadm kubectl
    systemctl enable kubelet
}

# 2. Initialize the control-plane node
init_master() {
    kubeadm init \
        --pod-network-cidr=10.244.0.0/16 \
        --service-cidr=10.96.0.0/12 \
        --apiserver-advertise-address=10.0.1.10

    # Configure kubectl for the current user
    mkdir -p "$HOME/.kube"
    cp -i /etc/kubernetes/admin.conf "$HOME/.kube/config"
    chown "$(id -u):$(id -g)" "$HOME/.kube/config"
}

# 3. Install the network plugin (Calico)
install_network() {
    kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
}

# 4. Join worker nodes
join_workers() {
    # Run this on the control-plane node to print the `kubeadm join`
    # command, then execute that command on each worker node.
    kubeadm token create --print-join-command
}
```
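Once the script has run on every node, confirm that the workers joined and the CNI is healthy before moving on (a minimal check, assuming kubectl is configured on the control-plane node):

```bash
# All nodes should report Ready
kubectl get nodes -o wide

# The Calico pods in kube-system should all be Running
kubectl get pods -n kube-system -l k8s-app=calico-node
```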
3.2 Configuring Storage Classes
```yaml
# storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: b9127830-b0cc-4e34-aa47-9d1a2e9949a8
  pool: kube
  imageFeatures: layering
reclaimPolicy: Delete
volumeBindingMode: Immediate
```
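Note that `clusterID` and `pool` must match the Ceph cluster you deploy in section 6. Apply and verify:

```bash
kubectl apply -f storageclass.yaml
kubectl get storageclass
```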
4. Deploying Infrastructure Services
4.1 Installing Helm
```bash
# Install Helm 3
curl https://get.helm.sh/helm-v3.12.0-linux-amd64.tar.gz | tar -xzO linux-amd64/helm > /usr/local/bin/helm
chmod +x /usr/local/bin/helm

# Add the OpenStack-Helm repositories
helm repo add openstack-helm https://tarballs.openstack.org/openstack-helm/
helm repo add openstack-helm-infra https://tarballs.openstack.org/openstack-helm-infra/
helm repo update
```
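A quick check that the binary works and the repositories are reachable:

```bash
helm version
helm search repo openstack-helm/keystone
```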
4.2 Creating Namespaces
```yaml
# namespaces.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: openstack
  labels:
    name: openstack
---
apiVersion: v1
kind: Namespace
metadata:
  name: ceph
  labels:
    name: ceph
---
apiVersion: v1
kind: Namespace
metadata:
  name: openstack-infra
  labels:
    name: openstack-infra
```
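Apply the manifest and confirm the namespaces exist:

```bash
kubectl apply -f namespaces.yaml
kubectl get namespace openstack ceph openstack-infra
```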
4.3 Deploying the MariaDB Cluster
```bash
# Create the MariaDB configuration
cat > mariadb-values.yaml << 'EOF'
pod:
  replicas:
    server: 3
conf:
  database:
    max_connections: 1000
storage:
  mysql:
    requests:
      storage: 50Gi
    storageclass: local-storage
endpoints:
  oslo_db:
    hosts:
      default: mariadb-galera
    host_fqdn_override:
      default: null
    path: null
    scheme: mysql+pymysql
    port:
      mysql:
        default: 3306
EOF

# Deploy MariaDB
helm upgrade --install mariadb openstack-helm-infra/mariadb \
  --namespace=openstack-infra \
  --values=mariadb-values.yaml \
  --wait
```
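Before layering OpenStack on top, confirm that all three Galera members actually formed one cluster. A minimal check, assuming the chart's default StatefulSet naming (`mariadb-server-0`) and that the MariaDB root password has been exported as `MYSQL_ROOT_PASSWORD`:

```bash
# A healthy three-node Galera cluster reports wsrep_cluster_size = 3
kubectl exec -n openstack-infra mariadb-server-0 -- \
  mysql -u root -p"$MYSQL_ROOT_PASSWORD" \
  -e "SHOW STATUS LIKE 'wsrep_cluster_size';"
```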
4.4 Deploying the RabbitMQ Cluster
```bash
# Create the RabbitMQ configuration
cat > rabbitmq-values.yaml << 'EOF'
pod:
  replicas:
    server: 3
storage:
  requests:
    storage: 10Gi
  storageclass: local-storage
conf:
  rabbitmq:
    cluster_formation:
      peer_discovery_backend: rabbit_peer_discovery_k8s
      k8s:
        host: kubernetes.default.svc.cluster.local
EOF

# Deploy RabbitMQ
helm upgrade --install rabbitmq openstack-helm-infra/rabbitmq \
  --namespace=openstack-infra \
  --values=rabbitmq-values.yaml \
  --wait
```
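Likewise, confirm that the three RabbitMQ nodes clustered (the pod name assumes the chart's default StatefulSet naming):

```bash
# running_nodes in the output should list all three members
kubectl exec -n openstack-infra rabbitmq-rabbitmq-0 -- \
  rabbitmqctl cluster_status
```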
4.5 Deploying Memcached
```bash
# Deploy Memcached
helm upgrade --install memcached openstack-helm-infra/memcached \
  --namespace=openstack-infra \
  --wait
```
5. Deploying OpenStack Core Services
5.1 Deploying Keystone (Identity Service)
```bash
# Create the Keystone configuration
cat > keystone-values.yaml << 'EOF'
endpoints:
  identity:
    host_fqdn_override:
      default: keystone.openstack.example.com
    scheme:
      default: http
    port:
      api:
        default: 80
  oslo_db:
    hosts:
      default: mariadb-galera
    host_fqdn_override:
      default: null
    path: /keystone
    scheme: mysql+pymysql
    port:
      mysql:
        default: 3306
bootstrap:
  enabled: true
conf:
  keystone:
    DEFAULT:
      max_token_size: 255
  software:
    apache2:
      a2enmod:
        - rewrite
      a2dismod: []
pod:
  replicas:
    api: 3
EOF

# Deploy Keystone
helm upgrade --install keystone openstack-helm/keystone \
  --namespace=openstack \
  --values=keystone-values.yaml \
  --wait
```
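A quick in-cluster smoke test of the identity endpoint (the `keystone-api` service name follows the chart default); a JSON version document in the response means the API is up:

```bash
kubectl run keystone-check --rm -it --restart=Never \
  --image=curlimages/curl --command -- \
  curl -s http://keystone-api.openstack.svc.cluster.local:5000/v3
```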
5.2 Deploying Glance (Image Service)
```bash
# Create the Glance configuration
cat > glance-values.yaml << 'EOF'
storage: rbd
conf:
  glance:
    DEFAULT:
      show_image_direct_url: true
    glance_store:
      stores: rbd
      default_store: rbd
      rbd_store_pool: images
      rbd_store_user: glance
      rbd_store_ceph_conf: /etc/ceph/ceph.conf
      rbd_store_chunk_size: 8
dependencies:
  static:
    api:
      jobs:
        - glance-storage-init
        - glance-db-sync
        - glance-ks-user
        - glance-ks-endpoints
ceph_client:
  configmap: ceph-etc
  user_secret_name: pvc-ceph-client-key
pod:
  replicas:
    api: 3
    # glance-registry has been removed from recent OpenStack releases;
    # drop this replica count when deploying a newer version.
    registry: 3
EOF

# Deploy Glance
helm upgrade --install glance openstack-helm/glance \
  --namespace=openstack \
  --values=glance-values.yaml \
  --wait
```
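To exercise the Glance-to-Ceph path end to end, upload a small test image (this assumes the admin credentials from section 9.1 are already exported):

```bash
wget -q http://download.cirros-cloud.net/0.6.2/cirros-0.6.2-x86_64-disk.img
openstack image create cirros \
  --file cirros-0.6.2-x86_64-disk.img \
  --disk-format qcow2 --container-format bare --public
openstack image list
```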
5.3 Deploying Nova (Compute Service)
```bash
# Create the Nova configuration
cat > nova-values.yaml << 'EOF'
conf:
  nova:
    DEFAULT:
      osapi_compute_workers: 8
      metadata_workers: 8
    libvirt:
      virt_type: kvm
      cpu_mode: host-passthrough
    vnc:
      server_listen: 0.0.0.0
      server_proxyclient_address: $my_ip
      # This URL must be reachable from users' browsers; replace the
      # cluster-internal address with an external endpoint in production.
      novncproxy_base_url: http://nova-novncproxy.openstack.svc.cluster.local:6080/vnc_auto.html
network:
  backend: neutron
ceph_client:
  configmap: ceph-etc
  user_secret_name: pvc-ceph-client-key
pod:
  replicas:
    api_metadata: 3
    osapi: 3
    conductor: 3
    scheduler: 3
    novncproxy: 3
labels:
  agent:
    compute:
      node_selector_key: openstack-compute-node
      node_selector_value: enabled
EOF

# Label the compute nodes
kubectl label nodes <compute-node-1> openstack-compute-node=enabled
kubectl label nodes <compute-node-2> openstack-compute-node=enabled

# Deploy Nova
helm upgrade --install nova openstack-helm/nova \
  --namespace=openstack \
  --values=nova-values.yaml \
  --wait
```
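Verify that nova-compute landed on the labeled nodes and registered with the control plane (the `application=nova` pod label follows openstack-helm conventions; the CLI commands assume admin credentials are exported):

```bash
kubectl get pods -n openstack -l application=nova -o wide
openstack compute service list
openstack hypervisor list
```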
5.4 Deploying Neutron (Networking Service)
```bash
# Create the Neutron configuration
cat > neutron-values.yaml << 'EOF'
network:
  backend: openvswitch
  interface:
    # Interface carrying VXLAN tunnel traffic; replace docker0 with
    # the host NIC attached to your tenant network.
    tunnel: docker0
conf:
  neutron:
    DEFAULT:
      l3_ha: true
      max_l3_agents_per_router: 3
      l3_ha_net_cidr: 169.254.192.0/18
  neutron_sudoers: |
    neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf *
  plugins:
    ml2_conf:
      ml2:
        type_drivers: flat,vlan,vxlan
        tenant_network_types: vxlan
        mechanism_drivers: openvswitch,l2population
      ml2_type_vxlan:
        vni_ranges: 1:1000
labels:
  agent:
    dhcp:
      node_selector_key: openstack-network-node
      node_selector_value: enabled
    l3:
      node_selector_key: openstack-network-node
      node_selector_value: enabled
    metadata:
      node_selector_key: openstack-network-node
      node_selector_value: enabled
pod:
  replicas:
    server: 3
EOF

# Label the network nodes
kubectl label nodes <network-node-1> openstack-network-node=enabled

# Deploy Neutron
helm upgrade --install neutron openstack-helm/neutron \
  --namespace=openstack \
  --values=neutron-values.yaml \
  --wait
```
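Confirm the agents came up on the labeled node:

```bash
kubectl get pods -n openstack -l application=neutron -o wide
openstack network agent list
```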
5.5 Deploying Horizon (Web UI)
```bash
# Create the Horizon configuration
cat > horizon-values.yaml << 'EOF'
network:
  node_port:
    enabled: true
    port: 31000
conf:
  horizon:
    local_settings:
      config:
        openstack_keystone_url: "http://keystone-api.openstack.svc.cluster.local:5000/v3"
        openstack_keystone_default_role: "_member_"
pod:
  replicas:
    server: 3
EOF

# Deploy Horizon
helm upgrade --install horizon openstack-helm/horizon \
  --namespace=openstack \
  --values=horizon-values.yaml \
  --wait
```
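Horizon is exposed as a NodePort, so it should answer on port 31000 of any node; a 200 or 302 response means the dashboard is serving:

```bash
curl -I http://<node-ip>:31000
```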
6. Deploying Storage Services
6.1 Deploying a Ceph Cluster
```bash
# Deploy with Rook-Ceph (consider pinning these manifests to a release
# branch instead of master for reproducible installs)
kubectl create -f https://raw.githubusercontent.com/rook/rook/master/deploy/examples/crds.yaml
kubectl create -f https://raw.githubusercontent.com/rook/rook/master/deploy/examples/common.yaml
kubectl create -f https://raw.githubusercontent.com/rook/rook/master/deploy/examples/operator.yaml

# Create the Ceph cluster definition
cat > ceph-cluster.yaml << 'EOF'
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v17.2.6
  dataDirHostPath: /var/lib/rook
  skipUpgradeChecks: false
  continueUpgradeAfterChecksEvenIfNotHealthy: false
  mon:
    count: 3
    allowMultiplePerNode: false
  mgr:
    count: 2
  dashboard:
    enabled: true
  storage:
    useAllNodes: true
    useAllDevices: true
EOF
kubectl create -f ceph-cluster.yaml
```
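Wait for the cluster to settle before creating pools. The Rook toolbox (same manifest path as above) provides a `ceph` CLI inside the cluster:

```bash
kubectl create -f https://raw.githubusercontent.com/rook/rook/master/deploy/examples/toolbox.yaml
# Proceed once this reports HEALTH_OK
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph status
```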
6.2 Creating Storage Pools
```yaml
# ceph-pools.yaml
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: rbd-pool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: images-pool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
```
7. Network Configuration
7.1 Configuring Open vSwitch
```bash
# Install OVS on all nodes
yum install -y openvswitch
systemctl enable openvswitch
systemctl start openvswitch

# Create the external bridge and attach the physical NIC
ovs-vsctl add-br br-ex
ovs-vsctl add-port br-ex <external-interface>

# Persist the bridge configuration
cat > /etc/sysconfig/network-scripts/ifcfg-br-ex << 'EOF'
DEVICE=br-ex
DEVICETYPE=ovs
TYPE=OVSBridge
BOOTPROTO=static
IPADDR=192.168.1.10
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
ONBOOT=yes
EOF
```
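Verify that the bridge and its port were created:

```bash
ovs-vsctl show
```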
7.2 Configuring an Ingress Controller
```yaml
# ingress-controller.yaml
apiVersion: v1
kind: Service
metadata:
  name: openstack-ingress
  namespace: openstack
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 80
      name: http
    - port: 443
      targetPort: 443
      name: https
  selector:
    app: nginx-ingress-controller
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: openstack-apis
  namespace: openstack
spec:
  rules:
    - host: keystone.openstack.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: keystone-api
                port:
                  number: 5000
    - host: nova.openstack.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nova-api
                port:
                  number: 8774
```
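Until proper DNS records exist, you can point the API hostnames at the ingress address and smoke-test Keystone through it (a sketch, assuming the LoadBalancer was granted an external IP):

```bash
INGRESS_IP=$(kubectl get svc -n openstack openstack-ingress \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "$INGRESS_IP keystone.openstack.example.com nova.openstack.example.com" | sudo tee -a /etc/hosts
curl -s http://keystone.openstack.example.com/v3
```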
8. Monitoring and Logging
8.1 Deploying Prometheus Monitoring
```bash
# Add the Prometheus Helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Deploy Prometheus
helm upgrade --install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace \
  --set grafana.enabled=true \
  --set alertmanager.enabled=true
```
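Grafana can then be reached with a port-forward (the service name and default credentials follow the kube-prometheus-stack chart conventions for a release named `prometheus`):

```bash
kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80
# Then open http://localhost:3000 (default login: admin / prom-operator)
```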
8.2 Configuring Log Collection
```yaml
# fluentd-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
  namespace: kube-system
data:
  fluent.conf: |
    <source>
      @type tail
      path /var/log/containers/*openstack*.log
      pos_file /var/log/fluentd-openstack.log.pos
      tag kubernetes.openstack.*
      format json
      time_key time
      time_format %Y-%m-%dT%H:%M:%S.%NZ
    </source>
    <match kubernetes.openstack.**>
      @type elasticsearch
      host elasticsearch.logging.svc.cluster.local
      port 9200
      index_name openstack-logs
    </match>
```
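Apply the ConfigMap; note that a Fluentd DaemonSet mounting it is still required and is not shown here:

```bash
kubectl apply -f fluentd-config.yaml
```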
9. Verification and Testing
9.1 Verifying Service Status
```bash
# Check pod status
kubectl get pods -n openstack
kubectl get pods -n openstack-infra

# Check service endpoints
kubectl get endpoints -n openstack

# Check the OpenStack services themselves
export OS_AUTH_URL=http://keystone.openstack.example.com:5000/v3
export OS_PROJECT_DOMAIN_NAME=Default
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_NAME=admin
export OS_USERNAME=admin
export OS_PASSWORD=password
export OS_REGION_NAME=RegionOne
export OS_INTERFACE=public
export OS_IDENTITY_API_VERSION=3

openstack service list
openstack compute service list
openstack network agent list
```
9.2 Creating a Test Instance
```bash
# Create a tenant network
openstack network create private-net
openstack subnet create --network private-net --subnet-range 10.0.0.0/24 private-subnet

# Create the external network
openstack network create --external --provider-physical-network physnet1 \
  --provider-network-type flat public-net
openstack subnet create --network public-net --subnet-range 192.168.1.0/24 \
  --gateway 192.168.1.1 --allocation-pool start=192.168.1.100,end=192.168.1.200 \
  --no-dhcp public-subnet

# Create a router
openstack router create router1
openstack router set --external-gateway public-net router1
openstack router add subnet router1 private-subnet

# Boot an instance
openstack server create --flavor m1.small --image cirros --network private-net test-vm
```
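To reach the instance from the external network, attach a floating IP and test connectivity:

```bash
openstack floating ip create public-net
openstack server add floating ip test-vm <floating-ip>
ping -c 3 <floating-ip>
```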
10. Troubleshooting and Operations
10.1 Diagnosing Common Problems
```bash
# View pod logs
kubectl logs -n openstack <pod-name>

# Describe a pod
kubectl describe pod -n openstack <pod-name>

# Check service status
kubectl get svc -n openstack

# Check storage
kubectl get pv,pvc -n openstack

# View events
kubectl get events -n openstack --sort-by='.lastTimestamp'
```
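A useful first sweep is to list every pod that is not healthy, across all namespaces:

```bash
kubectl get pods --all-namespaces \
  --field-selector=status.phase!=Running,status.phase!=Succeeded
```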
10.2 Backup and Restore
```bash
# Back up etcd
ETCDCTL_API=3 etcdctl snapshot save backup.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key

# Back up the databases (supply the MariaDB root credentials)
kubectl exec -n openstack-infra mariadb-server-0 -- \
  mysqldump -u root -p"$MYSQL_ROOT_PASSWORD" --all-databases > openstack-db-backup.sql

# Back up configuration
kubectl get configmaps -n openstack -o yaml > openstack-configs.yaml
kubectl get secrets -n openstack -o yaml > openstack-secrets.yaml
```
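Restoring etcd is the mirror operation (a sketch, assuming kubeadm default paths; stop the kube-apiserver first):

```bash
ETCDCTL_API=3 etcdctl snapshot restore backup.db \
  --data-dir /var/lib/etcd-restored
# Point etcd's --data-dir at the restored directory, then restart etcd
# and the API server.
```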
11. Scaling and Upgrades
11.1 Scaling Out Compute Nodes
```bash
# Label the new compute node
kubectl label nodes <new-compute-node> openstack-compute-node=enabled

# Re-run the Nova release so the compute DaemonSet picks up the new node
helm upgrade nova openstack-helm/nova \
  --namespace=openstack \
  --values=nova-values.yaml
```
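The new hypervisor should appear once its nova-compute pod is Running (the `component=compute` label follows openstack-helm conventions):

```bash
kubectl get pods -n openstack -o wide -l application=nova,component=compute
openstack hypervisor list
```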
11.2 Upgrading the OpenStack Version
```bash
# Refresh the Helm repositories
helm repo update

# Upgrade the services one at a time
helm upgrade keystone openstack-helm/keystone \
  --namespace=openstack \
  --values=keystone-values.yaml

helm upgrade glance openstack-helm/glance \
  --namespace=openstack \
  --values=glance-values.yaml
```
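Helm keeps the release history, so a misbehaving upgrade can be rolled back to the previous revision:

```bash
helm history keystone -n openstack
helm rollback keystone -n openstack
```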
Summary
Deploying OpenStack on Kubernetes involves the following trade-offs:
Advantages:
- High availability: pods are restarted and rescheduled automatically
- Elastic scaling: capacity can grow and shrink on demand
- Resource efficiency: containerization improves utilization
- Standardized deployment: Helm charts provide a repeatable installation path
- Simpler operations: Kubernetes-native monitoring and logging
Caveats:
- Complexity: deployment and maintenance are comparatively involved
- Resource overhead: you carry the cost of both Kubernetes and OpenStack
- Network configuration: the network architecture must be planned carefully
- Storage requirements: a high-performance distributed storage backend is needed
- Troubleshooting: requires familiarity with both Kubernetes and OpenStack
This deployment model is best suited to large-scale, highly available cloud platforms.