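1. Storage Concepts

Distributed Storage

Concept

A distributed system is a distinct kind of system architecture.
It consists of a group of computer nodes that are connected over a network and coordinate with one another to complete a common task.
The point of distribution is to use inexpensive, ordinary computers to carry out complex computing and storage tasks.
The goal is to use more machines to handle more data or more tasks.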

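Features

Scalable: a distributed storage system can scale to clusters of several hundred or even several thousand machines, and overall performance grows roughly linearly as the cluster grows.
Low cost: automatic fault tolerance and automatic load balancing allow a distributed storage system to be built on ordinary PC hardware. Linear scalability also makes adding and removing machines easy, which enables automated operations.
High performance: a distributed storage system is expected to perform well both as a whole cluster and on each individual server.
Easy to use: a distributed storage system needs to expose easy-to-use external interfaces, and it also needs complete monitoring and operations tooling that integrates with other systems.

Distribution algorithms
- Hash distribution
- Sequential (range) distribution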
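
The two distribution algorithms listed above can be contrasted with a minimal sketch. Everything in it is an assumption made for illustration: the node names, the split points, and the choice of MD5 as the hash function are hypothetical and are not taken from any particular storage system.

```python
import hashlib
from bisect import bisect_right

NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names


def hash_placement(key, nodes=NODES):
    """Hash distribution: hash the key, then take the result modulo the node count."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


# Sequential (range) distribution: every node owns one contiguous key range.
# Hypothetical split points: key < "h" goes to node-a, "h" <= key < "p" goes to
# node-b, and everything else goes to node-c.
SPLIT_POINTS = ["h", "p"]


def range_placement(key, nodes=NODES, splits=SPLIT_POINTS):
    """Find the range that contains the key with a binary search."""
    return nodes[bisect_right(splits, key)]


if __name__ == "__main__":
    for key in ["apple", "kiwi", "zebra"]:
        print(key, hash_placement(key), range_placement(key))
```

Hash distribution tends to spread keys evenly but scatters neighbouring keys across nodes, while range distribution keeps neighbouring keys together at the risk of uneven load.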

Common distributed storage solutions
- Lustre
- Hadoop
- FastDFS
- GlusterFS
- Ceph

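2. Ceph Overview

Ceph Concepts

What is Ceph

Ceph is a distributed storage system that is highly scalable, highly available, and high-performance.
Ceph can provide block storage, file storage, and object storage.
Ceph supports storage capacity at the exabyte (EB) scale.
As a Software-Defined Storage (SDS) solution, it is widely used across the industry.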

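Ceph architecture diagram

Ceph Components and How They Work Together

Core components

Monitor: MON
- Monitors manage the overall state of the Ceph cluster, its configuration information, and its monitoring data
- They maintain the cluster maps and handle authentication between daemons and clients
- Monitors elect a leader among themselves to coordinate the other nodes in the cluster, and they receive and process requests from clients and OSDs
- For redundancy and high availability, at least three Monitors are normally required
Manager: MGR
- Managers provide cluster management functions, including cluster status monitoring, runtime metrics, and a REST API
- They host Python-based modules that manage and expose Ceph cluster information, including the web-based Ceph Dashboard and the REST API
- This lets administrators and users manage and operate the Ceph cluster through a graphical interface
- High availability normally requires at least two Managers
OSD (Object Storage Daemon)
- The OSD is the core component of a Ceph storage cluster
- It stores data and handles data replication, recovery, and rebalancing
- It provides some monitoring information to the Ceph Monitors and Managers by checking the heartbeats of other Ceph OSD daemons
- Each OSD node runs one or more OSD processes, each managing its own storage device
- For redundancy and high availability, at least three Ceph OSDs are normally required
MDS (Metadata Server)
- The MDS supports the Ceph file system (CephFS)
- It maintains the file system metadata
- It answers client access requests, maps file names to inodes, and tracks file locks
RGW (RADOS Gateway)
- RGW is the object storage gateway provided by Ceph and is compatible with the S3 and Swift protocols
- It lets users interact with the Ceph storage cluster through a RESTful API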
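
The Monitor count recommended in the component list above follows from majority voting: the Monitors can only agree on cluster state while more than half of them are reachable. The arithmetic can be sketched as follows; the function names are hypothetical and this is only an illustration of the majority rule, not Ceph code.

```python
def quorum_size(monitors):
    """Smallest number of Monitors that still forms a strict majority."""
    return monitors // 2 + 1


def tolerated_failures(monitors):
    """How many Monitors can be lost while a majority remains."""
    return monitors - quorum_size(monitors)


if __name__ == "__main__":
    for count in (1, 2, 3, 5):
        print(count, quorum_size(count), tolerated_failures(count))
```

With one or two Monitors no failure can be tolerated, while three Monitors survive the loss of one, which is why three is the usual minimum.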

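Supporting components

RADOS
- RADOS (Reliable Autonomic Distributed Object Store) is the underlying distributed object storage system
- As part of the Ceph storage engine, it provides a high-performance, scalable object storage service
CephFS
- CephFS is Ceph's distributed file system
- It provides file-level access by storing files in RADOS
Librados
- librados is the client library provided by Ceph; it allows developers to write applications on top of Ceph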

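Ceph workflow diagram

Ceph Data Storage

Terminology
- Object
  - The lowest-level storage unit in Ceph
  - Each object contains metadata and data
- Pool: storage pool
  - A logical partition for storing objects
  - Defines the type of data redundancy and the corresponding replica distribution policy
  - Supports two types, replicated and erasure-coded; in practice 3-way replication is used most of the time
- PG (Placement Group)
  - A logical concept
  - This layer is introduced to make data easier to distribute and locate
- CRUSH: the placement algorithm
  - The data distribution algorithm used by Ceph
  - Ensures that data is placed where it is expected to be
  - Controls the failure-domain level at which data is protected
  - Supports dynamic scaling, rebalancing, and recovery of the Ceph storage cluster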
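
The terms above fit together in two steps: an object name is hashed into a placement group of its pool, and the placement group is then mapped to a set of OSDs. The sketch below illustrates only that two-step idea. It is a simplified stand-in and not the real CRUSH algorithm: the host layout, the OSD ids, the use of SHA-256, and the ranking scheme are all assumptions made for the example.

```python
import hashlib

# Hypothetical layout: three hosts with three OSDs each (OSD ids are made up).
OSDS_BY_HOST = {
    "ceph1": [0, 1, 2],
    "ceph2": [3, 4, 5],
    "ceph3": [6, 7, 8],
}


def stable_hash(text):
    """Deterministic integer hash (SHA-256 here; Ceph itself uses a different hash)."""
    return int(hashlib.sha256(text.encode("utf-8")).hexdigest(), 16)


def object_to_pg(pool_id, object_name, pg_num):
    """Step 1: hash the object name into one of the pool's pg_num placement groups."""
    return "{}.{:x}".format(pool_id, stable_hash(object_name) % pg_num)


def pg_to_osds(pg_id, replicas=3):
    """Step 2 (stand-in for CRUSH): rank hosts per PG, then pick one OSD per host."""
    hosts = sorted(OSDS_BY_HOST, key=lambda host: stable_hash(pg_id + ":" + host))
    chosen = []
    for host in hosts[:replicas]:
        osd = max(
            OSDS_BY_HOST[host],
            key=lambda o: stable_hash("{}:osd.{}".format(pg_id, o)),
        )
        chosen.append(osd)
    return chosen


if __name__ == "__main__":
    pg = object_to_pg(1, "demo-object", 32)
    print(pg, pg_to_osds(pg))
```

Because every replica lands on a different host in this sketch, losing one host never removes more than one copy of a placement group, which is the kind of failure-domain guarantee CRUSH rules are meant to express.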

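3. Building the Ceph Cluster

Preparing the Lab Environment

Disable the firewall and SELinux on every host.

Hostname	IP address	Role	Memory / disks
pubserver (already exists)	eth0: 192.168.88.240	Ansible control host	no change needed
client (already exists)	eth0: 192.168.88.10	client	no change needed
ceph1	eth0: 192.168.88.11	Ceph cluster	4G / three additional 20G disks
ceph2	eth0: 192.168.88.12	Ceph cluster	4G / three additional 20G disks
ceph3	eth0: 192.168.88.13	Ceph cluster	4G / three additional 20G disks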
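
The disk layout in the table above determines how much data the lab cluster can hold. The following back-of-the-envelope sketch assumes that every additional disk becomes one OSD and that pools use 3-way replication; the erasure-coding profile (k=2, m=1) is a hypothetical comparison point, and metadata and safety-margin overhead are ignored.

```python
def raw_capacity_gb(hosts, disks_per_host, disk_gb):
    """Total raw capacity across all data disks."""
    return hosts * disks_per_host * disk_gb


def usable_replicated_gb(raw_gb, replicas=3):
    """Replicated pool: every object is stored `replicas` times."""
    return raw_gb / replicas


def usable_erasure_coded_gb(raw_gb, k, m):
    """Erasure-coded pool: k data chunks plus m coding chunks per object."""
    return raw_gb * k / (k + m)


if __name__ == "__main__":
    raw = raw_capacity_gb(hosts=3, disks_per_host=3, disk_gb=20)
    print("raw:", raw)
    print("3 replicas:", usable_replicated_gb(raw))
    print("EC k=2, m=1:", usable_erasure_coded_gb(raw, k=2, m=1))
```

Under these assumptions the nine 20G disks give 180G of raw space, of which about 60G is usable with three replicas.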

Ansible Setup

Configuring Ansible

Write the Ansible configuration files

    [root@pubserver ~]# mkdir ceph
    [root@pubserver ~]# cd ceph/
    [root@pubserver ceph]# vim ansible.cfg

Contents of ansible.cfg:

    [defaults]
    inventory = inventory
    module_name = shell
    host_key_checking = false
    # Look for roles in the roles directory under the current directory.
    roles_path = roles

    [root@pubserver ceph]# mkdir roles
    [root@pubserver ceph]# vim inventory

Contents of inventory:

    [ceph]
    ceph1 ansible_ssh_host=192.168.88.11
    ceph2 ansible_ssh_host=192.168.88.12
    ceph3 ansible_ssh_host=192.168.88.13

    [clients]
    client ansible_ssh_host=192.168.88.10

    [all:vars]
    # Login user
    ansible_ssh_user=root
    # Password of that user (set to your own)
    ansible_ssh_pass=a
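
To make the structure of the inventory explicit (host groups, per-host variables, and an `[all:vars]` section that applies to every host), here is a minimal reader for this INI-style layout. It is a simplified illustration only and not Ansible's own parser; the function name `parse_inventory` is hypothetical, and only whole-line comments are handled.

```python
INVENTORY = """\
[ceph]
ceph1 ansible_ssh_host=192.168.88.11
ceph2 ansible_ssh_host=192.168.88.12
ceph3 ansible_ssh_host=192.168.88.13

[clients]
client ansible_ssh_host=192.168.88.10

[all:vars]
# Login user
ansible_ssh_user=root
ansible_ssh_pass=a
"""


def parse_inventory(text):
    """Return (groups, group_vars) for a simple INI-style inventory."""
    groups = {}      # group name -> {host name -> {variable -> value}}
    group_vars = {}  # group name -> {variable -> value}
    current = None
    in_vars = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and whole-line comments
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]
            in_vars = name.endswith(":vars")
            current = name[: -len(":vars")] if in_vars else name
            (group_vars if in_vars else groups).setdefault(current, {})
            continue
        if current is None:
            continue  # ungrouped lines are ignored in this sketch
        if in_vars:
            key, _, value = line.partition("=")
            group_vars[current][key.strip()] = value.strip()
        else:
            host, *pairs = line.split()
            groups[current][host] = dict(pair.split("=", 1) for pair in pairs)
    return groups, group_vars


if __name__ == "__main__":
    groups, group_vars = parse_inventory(INVENTORY)
    print(sorted(groups["ceph"]))
    print(group_vars["all"]["ansible_ssh_user"])
```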

Verify the Ansible configuration and confirm the memory and disks of the Ceph nodes:

    [root@pubserver ceph]# ansible all -m ping
    [root@pubserver ceph]# ansible ceph -a "free -h"
    ceph3 | CHANGED | rc=0 >>
                  total        used        free      shared  buff/cache   available
    Mem:          3.7Gi       125Mi       3.5Gi        16Mi       111Mi       3.4Gi
    Swap:            0B          0B          0B
    ceph2 | CHANGED | rc=0 >>
                  total        used        free      shared  buff/cache   available
    Mem:          3.7Gi       125Mi       3.5Gi        16Mi       111Mi       3.4Gi
    Swap:            0B          0B          0B
    ceph1 | CHANGED | rc=0 >>
                  total        used        free      shared  buff/cache   available
    Mem:          3.7Gi       125Mi       3.5Gi        16Mi       111Mi       3.4Gi
    Swap:            0B          0B          0B
    [root@pubserver ceph]# ansible ceph -a "lsblk"
    ceph1 | CHANGED | rc=0 >>
    NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    vda    253:0    0  20G  0 disk
    └─vda1 253:1    0  20G  0 part /
    vdb    253:16   0  20G  0 disk
    vdc    253:32   0  20G  0 disk
    vdd    253:48   0  20G  0 disk
    ceph3 | CHANGED | rc=0 >>
    NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    vda    253:0    0  20G  0 disk
    └─vda1 253:1    0  20G  0 part /
    vdb    253:16   0  20G  0 disk
    vdc    253:32   0  20G  0 disk
    vdd    253:48   0  20G  0 disk
    ceph2 | CHANGED | rc=0 >>
    NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    vda    253:0    0  20G  0 disk
    └─vda1 253:1    0  20G  0 part /
    vdb    253:16   0  20G  0 disk
    vdc    253:32   0  20G  0 disk
    vdd    253:48   0  20G  0 disk

Basic Preparation

Update the custom yum repository and add the Ceph packages

    # Update the custom yum repository
    [root@server1 ~]# scp /linux-soft/s2/zzg/ceph_soft/cephclient-rpm/* root@192.168.88.240:/var/ftp/rpms/
    # Refresh the repodata of the yum repository
    [root@pubserver ~]# createrepo --update /var/ftp/rpms/

    # Update the yum repository on all nodes
    [root@pubserver ceph]# mkdir files
    [root@pubserver ceph]# vim files/local88.repo

Contents of files/local88.repo:

    [BaseOS]
    name=RockyLinux BaseOS
    baseurl="ftp://192.168.88.240/dvd/BaseOS/"
    enabled=1
    gpgcheck=0

    [AppStream]
    name=RockyLinux AppStream
    baseurl="ftp://192.168.88.240/dvd/AppStream/"
    enabled=1
    gpgcheck=0

    [rpms]
    name=local rpms
    baseurl="ftp://192.168.88.240/rpms/"
    enabled=1
    gpgcheck=0

    [root@pubserver ceph]# vim 02_update_yum.yml

Contents of 02_update_yum.yml:

    ---
    - name: update yum
      hosts: all
      tasks:
        - name: remove dir      # delete the directory
          file:
            path: /etc/yum.repos.d/
            state: absent
        - name: create dir      # recreate the directory
          file:
            path: /etc/yum.repos.d/
            state: directory
        - name: upload file     # copy the repo file to the node
          copy:
            src: files/local88.repo
            dest: /etc/yum.repos.d/local88.repo

    [root@pubserver ceph]# ansible-playbook 02_update_yum.yml

Configure hostname resolution on all nodes

Add name resolution entries for all nodes. The blockinfile module works much like lineinfile, but inserts a whole block of text into the target file. Note that 192.168.88.240 must resolve to quay.io.

    [root@pubserver ceph]# vim 01_update_hosts.yml

Contents of 01_update_hosts.yml:

    ---
    - name: update hosts
      hosts: all
      tasks:
        - name: add host resolv   # add hostname mappings to /etc/hosts
          blockinfile:
            path: /etc/hosts
            block: |
              192.168.88.10 client
              192.168.88.11 ceph1
              192.168.88.12 ceph2
              192.168.88.13 ceph3
              192.168.88.240 quay.io

    [root@pubserver ceph]# ansible-playbook 01_update_hosts.yml
    [root@pubserver ceph]# ansible all -a "tail -7 /etc/hosts"

Configure the time synchronization service

Configure the chronyd time synchronization service. Start with the server side:

    # Show the system time configuration
    [root@pubserver ~]# timedatectl
    # Set the time zone to Shanghai
    [root@pubserver ~]# timedatectl set-timezone Asia/Shanghai
    [root@pubserver ~]# date
    # Correct the date and time if they are wrong
    [root@pubserver ~]# date -s "YYYY-MM-DD HH:MM:SS"
    # chrony is already installed
    [root@pubserver ~]# yum -y install chrony
    [root@pubserver ~]# vim /etc/chrony.conf

Relevant lines of /etc/chrony.conf (lines 25 to 28 in the editor):

    ...
    # Allow hosts in the 192.168.88.0/24 network to synchronize time
    allow 192.168.88.0/24

    # Serve time even if not synchronized to a time source.
    # The server then reports itself as stratum 10.
    local stratum 10
    ...

    # Start the service at boot
    [root@pubserver ~]# systemctl enable chronyd
    # Restart the chronyd service
    [root@pubserver ~]# systemctl restart chronyd
    [root@pubserver ~]# ss -antlpu | grep chronyd
    udp   UNCONN 0      0          127.0.0.1:323        0.0.0.0:*    users:(("chronyd",pid=9225,fd=5))
    udp   UNCONN 0      0            0.0.0.0:123        0.0.0.0:*    users:(("chronyd",pid=9225,fd=6))

Then configure the clients, using a system role:

    [root@pubserver ~]# yum -y install rhel-system-roles
    # Copy the timesync role shipped with the RHEL system roles into the current Ansible project
    [root@pubserver ceph]# cp -r /usr/share/ansible/roles/rhel-system-roles.timesync/ ./roles/timesync
    # List the Ansible roles available in the current environment
    [root@pubserver ceph]# ansible-galaxy list
    # /root/ceph/roles
    - timesync, (unknown version)
    [root@pubserver ceph]# vim 03_timesync.yml

Contents of 03_timesync.yml:

    ---
    - name: config ntp          # configure the time service with the timesync role
      hosts: all
      vars:
        timesync_ntp_servers:
          - hostname: 192.168.88.240
            iburst: yes
      roles:
        - timesync

    [root@pubserver ceph]# ansible-playbook 03_timesync.yml
    # chronyc sources lists every configured time source (NTP server) and its synchronization state
    [root@pubserver ceph]# ansible all -a "chronyc sources"
    ceph1 | CHANGED | rc=0 >>
    MS Name/IP address         Stratum Poll Reach LastRx Last sample
    ===============================================================================
    ^* quay.io                       4   6   177    10    +12us[  +20us] +/-   39ms
    ceph3 | CHANGED | rc=0 >>
    MS Name/IP address         Stratum Poll Reach LastRx Last sample
    ===============================================================================
    ^* quay.io                       4   6   177    11    -29us[  -31us] +/-   38ms
    ceph2 | CHANGED | rc=0 >>
    MS Name/IP address         Stratum Poll Reach LastRx Last sample
    ===============================================================================
    ^* quay.io                       4   6   177    11    +17us[  +37us] +/-   38ms
    client | CHANGED | rc=0 >>
    MS Name/IP address         Stratum Poll Reach LastRx Last sample
    ===============================================================================
    ^* quay.io                       4   6   177    10    -10us[  -21us] +/-   38ms

Install the required software on the Ceph nodes

Ceph Quincy is deployed in containers.
The Ceph nodes need a Python 3 environment, a container management tool (podman or docker), and the lvm2 package.

    [root@pubserver ceph]# vim 04_inst_pkgs.yml

Contents of 04_inst_pkgs.yml:

    ---
    - name: install pkgs
      hosts: ceph
      tasks:
        - name: install pkgs    # install the required packages
          yum:
            name: python39,podman,lvm2
            state: present

    [root@pubserver ceph]# ansible-playbook 04_inst_pkgs.yml

Set up a private Ceph image registry

Deploying a Ceph Quincy cluster requires the cephadm tool, which is a Python script. During deployment it connects to the public quay.io site to download the Ceph images. To avoid failures when the Internet is unreachable, and network congestion when many nodes download at the same time, we deploy our own private registry that answers as quay.io, so that cephadm pulls from it instead.

    # Upload the Ceph cluster files (the cephadm script and the Ceph images)
    [root@server1 ~]# scp -r /linux-soft/s2/zzg/ceph_soft/ceph-server/ root@192.168.88.240:/root

    # Set up the private image registry
    [root@pubserver ceph]# cd /root/ceph-server/
    [root@pubserver ceph-server]# yum -y install docker-distribution-2.6.2-2.git48294d9.el7.x86_64.rpm
    [root@pubserver ceph-server]# vim /etc/docker-distribution/registry/config.yml

This config.yml is the core configuration file of the private Docker Registry. It determines the storage location, the listening port, access logging, and other key behavior of the registry.

    version: 0.1
    log:
      fields:
        service: registry
    storage:
      cache:
        layerinfo: inmemory
      filesystem:
        rootdirectory: /var/lib/registry
    http:
      # Port changed from 5000 to 80. This change is required; otherwise later image downloads fail.
      addr: :80

    [root@pubserver ceph-server]# systemctl enable --now docker-distribution.service
    # Confirm that port 80 is held by the registry process
    [root@pubserver ceph-server]# ss -antpul | grep :80
    tcp   LISTEN 0      128                *:80               *:*    users:(("registry",pid=11002,fd=3))
    # The registry is still empty at this point
    [root@pubserver ceph-server]# curl http://localhost/v2/_catalog
    {"repositories":[]}

Import the Ceph images. First add the following line to /etc/hosts on pubserver:

    [root@pubserver ceph-server]# vim /etc/hosts
    192.168.88.240 quay.io

    [root@pubserver ceph-server]# yum -y install podman
    [root@pubserver ceph-server]# vim /etc/containers/registries.conf

Append the private registry to the end of /etc/containers/registries.conf:

    ...
    [[registry]]
    # Address of the private registry
    location = "quay.io"
    # Allow plain HTTP
    insecure = true

Load the Ceph images into local storage:

    [root@pubserver ceph-server]# for i in *.tar
    > do
    >     podman load -i $i
    > done
    [root@pubserver ceph-server]# podman images
    REPOSITORY                        TAG       IMAGE ID      CREATED        SIZE
    quay.io/ceph/ceph                 v17       cc65afd6173a  17 months ago  1.4 GB
    quay.io/ceph/ceph-grafana         8.3.5     dad864ee21e9  24 months ago  571 MB
    quay.io/prometheus/prometheus     v2.33.4   514e6a882f6e  2 years ago    205 MB
    quay.io/prometheus/node-exporter  v1.3.1    1dbe0e931976  2 years ago    22.3 MB
    quay.io/prometheus/alertmanager   v0.23.0   ba2b418f427c  2 years ago    58.9 MB

Push the images to the private registry:

    [root@pubserver ceph-server]# podman push quay.io/ceph/ceph:v17
    [root@pubserver ceph-server]# podman push quay.io/ceph/ceph-grafana:8.3.5
    [root@pubserver ceph-server]# podman push quay.io/prometheus/prometheus:v2.33.4
    [root@pubserver ceph-server]# podman push quay.io/prometheus/node-exporter:v1.3.1
    [root@pubserver ceph-server]# podman push quay.io/prometheus/alertmanager:v0.23.0

Verify that the Ceph images are stored in the private registry:

    [root@pubserver ceph-server]# curl http://quay.io/v2/_catalog
    {"repositories":["ceph/ceph","ceph/ceph-grafana","prometheus/alertmanager","prometheus/node-exporter","prometheus/prometheus"]}

Configure the Ceph nodes to use the private image registry:

    [root@pubserver ~]# cd /root/ceph
    [root@pubserver ceph]# vim 05_config_priv_registry.yml

Contents of 05_config_priv_registry.yml:

    ---
    - name: config private registry
      hosts: ceph
      tasks:
        - name: add quay.io     # configure the private registry
          blockinfile:
            path: /etc/containers/registries.conf
            block: |
              [[registry]]
              location = "quay.io"
              insecure = true

    [root@pubserver ceph]# ansible-playbook 05_config_priv_registry.yml
    [root@pubserver ceph]# ansible ceph -a 'tail -5 /etc/containers/registries.conf'
    ceph1 | CHANGED | rc=0 >>
    # BEGIN ANSIBLE MANAGED BLOCK
    [[registry]]
    location = "quay.io"
    insecure = true
    # END ANSIBLE MANAGED BLOCK
    ceph2 | CHANGED | rc=0 >>
    # BEGIN ANSIBLE MANAGED BLOCK
    [[registry]]
    location = "quay.io"
    insecure = true
    # END ANSIBLE MANAGED BLOCK
    ceph3 | CHANGED | rc=0 >>
    # BEGIN ANSIBLE MANAGED BLOCK
    [[registry]]
    location = "quay.io"
    insecure = true
    # END ANSIBLE MANAGED BLOCK
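
The BEGIN and END marker lines in the output above are what let blockinfile be rerun safely: if the markers are already present, the text between them is replaced instead of being appended a second time. The sketch below imitates that behavior in a few lines. It is a simplified illustration and not the module's actual implementation; the function name `apply_block` and the sample file content are hypothetical.

```python
BEGIN = "# BEGIN ANSIBLE MANAGED BLOCK"
END = "# END ANSIBLE MANAGED BLOCK"


def apply_block(content, block):
    """Insert `block` between marker lines, replacing an earlier managed block if present."""
    lines = content.splitlines()
    managed = [BEGIN] + block.splitlines() + [END]
    try:
        start = lines.index(BEGIN)
        stop = lines.index(END, start)
        lines[start:stop + 1] = managed   # markers found: replace in place
    except ValueError:
        lines.extend(managed)             # no markers yet: append at end of file
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    registry_block = '[[registry]]\nlocation = "quay.io"\ninsecure = true'
    original = "# hypothetical existing content\n"
    once = apply_block(original, registry_block)
    twice = apply_block(once, registry_block)
    print(once)
    print(once == twice)
```

Running the function twice with the same block leaves the text unchanged, which mirrors why the playbook can be executed repeatedly without duplicating the registry entry.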

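Before building the Ceph cluster, it is advisable to temporarily remove the default gateway from the Ceph nodes so that they cannot reach the public Internet.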
