Ceph Cluster Management

0 Environment
IP address    Hostname   Extra disks                      In Ceph cluster
10.0.0.141    ceph141    sdb 300G, sdc 500G
10.0.0.142    ceph142    sdb 300G, sdc 500G, sdd 1000G
10.0.0.143    ceph143    sdb 300G, sdc 500G

In the previous article, a Ceph admin node (ceph141) was successfully initialized. The next step is to add the ceph142 and ceph143 nodes to the cluster.

Before the new hosts are added, the cluster cannot use them; the orchestrator's device and host lists only contain ceph141:

bash
[root@ceph141~]# ceph orch device ls
HOST     PATH      TYPE  DEVICE ID                                   SIZE  AVAILABLE  REFRESHED  REJECT REASONS  
ceph141  /dev/sdb  hdd   ATA_VMware_Virtual_S_01000000000000000001   300G  Yes        6m ago                     
ceph141  /dev/sdc  hdd   ATA_VMware_Virtual_S_02000000000000000001   500G  Yes        6m ago                     
[root@ceph141~]# ceph orch host ls
HOST     ADDR        LABELS  STATUS  
ceph141  10.0.0.141  _admin          
1 hosts in cluster

Check the Ceph status; at the moment there is only this one freshly initialized node.
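
A quick way to confirm this, assuming the ceph CLI is already usable on ceph141 (output omitted here); at this point it should report a single mon and mgr and no OSDs:

bash
# Show the overall cluster status; equivalent to 'ceph status'
[root@ceph141 ~]# ceph -s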

1 Adding and Removing Ceph Nodes

orch is short for orchestrator, i.e. the orchestration module driven by cephadm.

1. Check the cluster hosts: there is only 1, so the other nodes need to be added next. The same operation can also be done from the web dashboard.

bash
[root@ceph141~]# ceph orch host ls
HOST     ADDR        LABELS  STATUS  
ceph141  10.0.0.141  _admin          
1 hosts in cluster

2. Copy the cluster's SSH public key to the other servers so that cephadm can log in without a password.

bash
[root@ceph141 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub ceph142
[root@ceph141 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub ceph143

3. Add the other nodes to the cluster.

bash
[root@ceph141~]# ceph orch host add ceph142 10.0.0.142
Added host 'ceph142' with addr '10.0.0.142'
[root@ceph141~]# ceph orch host add ceph143 10.0.0.143
Added host 'ceph143' with addr '10.0.0.143'

4. Check the host list again; all 3 nodes are now listed.

bash
[root@ceph141~]# ceph orch host ls
HOST     ADDR        LABELS  STATUS  
ceph141  10.0.0.141  _admin          
ceph142  10.0.0.142                  
ceph143  10.0.0.143                  
3 hosts in cluster

5. Test removing a node (remember to add it back afterwards). The removal is rejected as long as daemons are still running on that host:

bash
[root@ceph141~]# ceph orch host rm ceph143
Error EINVAL: Not allowed to remove ceph143 from cluster. The following daemons are running in the host:
type                 id             
-------------------- ---------------
ceph-exporter        ceph143        
crash                ceph143        
node-exporter        ceph143        
mon                  ceph143        

Please run 'ceph orch host drain ceph143' to remove daemons from host
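
A minimal sketch of how a removal would actually proceed (not executed in this walkthrough): drain the host first, wait until its daemons are gone, remove it, and then add it back:

bash
# Drain all daemons from the host (cephadm schedules their removal)
[root@ceph141 ~]# ceph orch host drain ceph143
# Wait until no daemons are listed for this host any more
[root@ceph141 ~]# ceph orch ps ceph143
# Now the host can be removed ...
[root@ceph141 ~]# ceph orch host rm ceph143
# ... and added back afterwards
[root@ceph141 ~]# ceph orch host add ceph143 10.0.0.143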

2 Adding OSD Devices to the Cluster

For a device to join the Ceph cluster as an OSD (Object Storage Device), two conditions must be met:

  • The device must be unused; a disk that is already partitioned or otherwise in use cannot be added
  • The device must be larger than 5 GB

1. Before adding any OSDs, check the available devices and the (still empty) OSD tree.

bash
[root@ceph141~]# ceph orch device ls
HOST     PATH      TYPE  DEVICE ID                                   SIZE  AVAILABLE  REFRESHED  REJECT REASONS  
ceph141  /dev/sdb  hdd   ATA_VMware_Virtual_S_01000000000000000001   300G  Yes        45s ago                    
ceph141  /dev/sdc  hdd   ATA_VMware_Virtual_S_02000000000000000001   500G  Yes        45s ago                    
ceph142  /dev/sdb  hdd   ATA_VMware_Virtual_S_01000000000000000001   300G  Yes        7m ago                     
ceph142  /dev/sdc  hdd   ATA_VMware_Virtual_S_02000000000000000001   500G  Yes        7m ago                     
ceph142  /dev/sdd  hdd   ATA_VMware_Virtual_S_03000000000000000001  1024G  Yes        7m ago                     
ceph143  /dev/sdb  hdd   ATA_VMware_Virtual_S_01000000000000000001   300G  Yes        6m ago                     
ceph143  /dev/sdc  hdd   ATA_VMware_Virtual_S_02000000000000000001   500G  Yes        6m ago
[root@ceph141~]# ceph osd tree
ID  CLASS  WEIGHT  TYPE NAME     STATUS  REWEIGHT  PRI-AFF
-1              0  root default

2. Add the devices listed above. Adjust the disk names to match your own environment.

ceph orch daemon add creates the OSD daemon directly on the given host and device.

bash
[root@ceph141 ~]# ceph orch daemon add osd ceph141:/dev/sdb
[root@ceph141 ~]# ceph orch daemon add osd ceph141:/dev/sdc
[root@ceph141 ~]# ceph orch daemon add osd ceph142:/dev/sdb
[root@ceph141 ~]# ceph orch daemon add osd ceph142:/dev/sdc
[root@ceph141 ~]# ceph orch daemon add osd ceph142:/dev/sdd
[root@ceph141 ~]# ceph orch daemon add osd ceph143:/dev/sdb
[root@ceph141 ~]# ceph orch daemon add osd ceph143:/dev/sdc
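
As a side note (an alternative not used in the original steps), cephadm can also be told to consume every eligible device automatically instead of adding them one by one:

bash
# Declare a service spec that creates an OSD on every available, unused device
[root@ceph141 ~]# ceph orch apply osd --all-available-devices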

3. Check the OSDs that were just added and the LVM volumes created on the disks. The devices are no longer available as OSD candidates because they now carry an LVM layer and a filesystem.

bash
[root@ceph141~]# ceph orch device ls
Has a FileSystem, Insufficient space (<10 extents) on vgs, LVM detected
Has a FileSystem, Insufficient space (<10 extents) on vgs, LVM detected

[root@ceph141~]# lsblk | egrep -A 1 '^sdb|^sdc'
sdb                                                                                                     8:16   0  300G  0 disk 
└─ceph--ae423c15--8ff6--4e72--af97--ac909ca57fac-osd--block--6fc95050--e75e--4833--845b--b0e7698e1543 253:0    0  300G  0 lvm  
sdc                                                                                                     8:32   0  500G  0 disk 
└─ceph--00a6dda3--90da--46ba--8962--31835455173a-osd--block--2f6e56ff--9926--4c4e--8931--a780eb353431 253:1    0  500G  0 lvm

As the lsblk output shows, Ceph builds its OSDs on top of LVM.
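
To inspect the full OSD-to-LVM mapping on a host, a minimal sketch (assuming cephadm is installed on that node, since the OSDs run ceph-volume inside containers):

bash
# List the LVM volumes that back each OSD on this host
[root@ceph141 ~]# cephadm ceph-volume lvm list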

4. Check the cluster status again. The command ceph status is equivalent to ceph -s.

bash
[root@ceph141~]# ceph status 
  cluster:
    id:     12fad866-9aa0-11ef-8656-6516a17ad6dd
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum ceph141,ceph142,ceph143 (age 10m)
    mgr: ceph141.yvswvf(active, since 10m), standbys: ceph142.gtcikx
    osd: 7 osds: 7 up (since 7m), 7 in (since 7m)
 
  data:
    pools:   1 pools, 1 pgs
    objects: 2 objects, 577 KiB
    usage:   192 MiB used, 3.3 TiB / 3.3 TiB avail
    pgs:     1 active+clean

5. After everything has been added, the number of OSD devices is 7, which matches the 7 disks above.

bash
[root@ceph141~]# ceph osd ls
0
1
2
3
4
5
6
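
To see how these 7 OSDs are distributed across the hosts and how much space each one uses, the following commands can be run (output omitted here):

bash
# CRUSH tree: OSDs grouped under their hosts
[root@ceph141 ~]# ceph osd tree
# Per-OSD capacity, usage and PG count
[root@ceph141 ~]# ceph osd df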

3 Configuring Time Synchronization for the Cluster

The cluster relies on chrony for time synchronization; if the clocks of the nodes drift too far apart, cluster health can degrade (clock-skew warnings).

1. Install chrony on all nodes.

bash
apt -y install chrony 

2. Edit the configuration on ceph141; this node acts as the time source for the cluster.

Add one line to the configuration file: pool ntp.aliyun.com iburst maxsources 4

Finally restart the service: systemctl restart chronyd

bash
[root@ceph141~]# cat /etc/chrony/chrony.conf
confdir /etc/chrony/conf.d
pool ntp.ubuntu.com        iburst maxsources 4
pool 0.ubuntu.pool.ntp.org iburst maxsources 1
pool 1.ubuntu.pool.ntp.org iburst maxsources 1
pool 2.ubuntu.pool.ntp.org iburst maxsources 2
pool ntp.aliyun.com iburst maxsources 4
sourcedir /run/chrony-dhcp
sourcedir /etc/chrony/sources.d
keyfile /etc/chrony/chrony.keys
driftfile /var/lib/chrony/chrony.drift
ntsdumpdir /var/lib/chrony
logdir /var/log/chrony
maxupdateskew 100.0
rtcsync
makestep 1 3
leapsectz right/UTC

Verify that the service is working:

bash
[root@ceph141~]# chronyc activity -v
200 OK
7 sources online
0 sources offline
2 sources doing burst (return to online)
0 sources doing burst (return to offline)
7 sources with unknown address
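
The configuration above only adds an extra upstream pool on ceph141. If ceph142 and ceph143 should also sync their clocks from ceph141 (an assumption, not shown in the original output), a minimal sketch would be:

bash
# On ceph141: allow cluster nodes to query this chrony server
[root@ceph141 ~]# echo 'allow 10.0.0.0/24' >> /etc/chrony/chrony.conf
[root@ceph141 ~]# systemctl restart chronyd
# On ceph142 / ceph143: point chrony at ceph141 as a time source
[root@ceph142 ~]# echo 'server 10.0.0.141 iburst' >> /etc/chrony/chrony.conf
[root@ceph142 ~]# systemctl restart chronyd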

4 Setting Up an Additional Admin Node

1. Copy the apt source list and its signing key from ceph141 to the ceph142 node.

bash
scp /etc/apt/sources.list.d/ceph.list ceph142:/etc/apt/sources.list.d/
scp /etc/apt/trusted.gpg.d/ceph.release.gpg  ceph142:/etc/apt/trusted.gpg.d/

2. Install ceph-common on the ceph142 node.

bash
[root@ceph142~]# apt update
[root@ceph142~]# apt -y install ceph-common
[root@ceph142~]# ceph -v
ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable)

3. Copy the cluster configuration and admin keyring from ceph141 to ceph142.

bash
[root@ceph141~]# scp /etc/ceph/ceph.{conf,client.admin.keyring} ceph142:/etc/ceph/

4. Verify on ceph142 that the cluster can be accessed normally.

bash
[root@ceph142~]# ceph -s
  cluster:
    id:     12fad866-9aa0-11ef-8656-6516a17ad6dd
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum ceph141,ceph142,ceph143 (age 15m)
    mgr: ceph141.yvswvf(active, since 14m), standbys: ceph142.gtcikx
    osd: 7 osds: 7 up (since 14m), 7 in (since 35m)
 
  data:
    pools:   1 pools, 1 pgs
    objects: 2 objects, 577 KiB
    usage:   1003 MiB used, 3.3 TiB / 3.3 TiB avail
    pgs:     1 active+clean

5 Host Label Management

Label the hosts to make future management easier. The built-in _admin label in particular tells cephadm to distribute the cluster configuration and admin keyring to that host.


1. Add labels: the built-in _admin label to ceph142 and ceph143, and a custom label (wzy666) to ceph143.

bash
[root@ceph141 ~]# ceph orch host label add ceph142 _admin
Added label _admin to host ceph142
[root@ceph141 ~]# ceph orch host label add ceph143 _admin
Added label _admin to host ceph143
[root@ceph141 ~]# ceph orch host label add ceph143 wzy666
Added label wzy666 to host ceph143
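
The labels can be verified at any time; ceph orch host ls prints them in the LABELS column (output omitted here):

bash
# List all hosts together with their labels
[root@ceph141 ~]# ceph orch host ls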

2. Remove labels. Note that trying to remove a label the host does not carry (admin instead of _admin) only produces a warning:

bash
[root@ceph141 ~]# ceph orch host label rm ceph143 wzy666
Removed label wzy666 from host ceph143

[root@ceph141 ~]# ceph orch host label rm ceph143 admin
Host ceph143 does not have label 'admin'. Please use 'ceph orch host ls' to list all the labels.

[root@ceph141 ~]# ceph orch host label rm ceph143 _admin
Removed label _admin from host ceph143