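Principles
MinIO is a high-performance, S3-compatible object store. Its main characteristics are:
- Well suited to storing large volumes of unstructured data such as images, videos, and log files;
- An object can be of any size, from a few KB up to a maximum of 5 TB;
- Lightweight and efficient: by default MinIO does not compute MD5 checksums except when data is transferred to the client, which keeps it fast;
- Runs on Windows;
- Can be managed through a web console as well as from the command line;
- Distributed clusters support dynamic upgrades;
- Uses erasure coding for data redundancy: MinIO applies Reed-Solomon code to split each object into N/2 data blocks and N/2 parity blocks. With 12 drives, for example, an object is split into 6 data blocks and 6 parity blocks; you can lose any 6 drives (whether they hold data blocks or parity blocks) and still recover the data from the remaining drives.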
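The N/2 + N/2 split described in the last bullet can be turned into a quick capacity estimate. The sketch below is only arithmetic on that rule; the function name `ec_layout` and the 20 GiB drive size are assumptions made for illustration, and filesystem or metadata overhead is ignored.

```shell
# Sketch: data/parity split and usable capacity for N drives,
# assuming the N/2 data + N/2 parity layout described above.
# ec_layout is a hypothetical helper, not part of MinIO.
ec_layout() {
  local drives drive_gib data parity usable_gib
  drives=$1
  drive_gib=$2
  if [ "$drives" -lt 2 ] || [ $((drives % 2)) -ne 0 ]; then
    echo "drive count must be an even number >= 2" >&2
    return 1
  fi
  data=$((drives / 2))
  parity=$((drives - data))
  # Only the data blocks hold payload, so usable space is data * drive size
  usable_gib=$((data * drive_gib))
  echo "data=${data} parity=${parity} usable_gib=${usable_gib}"
}

ec_layout 12 20   # prints: data=6 parity=6 usable_gib=120
```

With this layout, half of the raw capacity goes to parity, which is the price for tolerating the loss of half of the drives.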
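MinIO has two main components:
- MinIO Server: the server side, which provides the object storage service
- MinIO Client: the command-line client, invoked as mc, which operates on objects stored on the server through Unix-like commands such as ls, cat, and find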
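As a rough illustration of the mc client mentioned above, the following sketch wraps a few common commands in a function. The alias name `myminio`, the URL, the bucket name, and the credentials are all placeholders that depend on your deployment; the function only does something useful against a running server.

```shell
# Sketch of a basic mc session. All names below (alias, URL, bucket,
# credentials) are assumptions for illustration; adjust them to your setup.
mc_smoke_test() {
  local target workdir
  target="myminio"
  workdir=$(mktemp -d)

  # Register the server under a local alias
  mc alias set "$target" http://127.0.0.1:9000 admin ytxcc123

  # Create a bucket and upload a small file
  mc mb "$target/demo-bucket"
  echo "hello minio" > "$workdir/hello.txt"
  mc cp "$workdir/hello.txt" "$target/demo-bucket/"

  # Unix-like commands against the bucket
  mc ls "$target/demo-bucket"
  mc cat "$target/demo-bucket/hello.txt"
  mc find "$target/demo-bucket" --name "*.txt"
}

# Run it once a server is reachable:
# mc_smoke_test
```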
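Concepts
Terminology
A few concepts are important in MinIO:
- Object: the basic unit stored in MinIO, such as a file, an image, or a video.
- Bucket: a logical space for storing Objects. Data in different Buckets is isolated from each other. From the client's point of view, a Bucket is like a top-level folder for files.
- Drive: a disk that stores data, passed to MinIO as an argument at startup. All object data in MinIO is stored on Drives.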
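- Set: a group of Drives. Distributed MinIO automatically divides the cluster into one or more Sets according to its size, and the Drives in each Set are spread across different locations:
  - An object is stored on a single Set
  - A cluster is divided into multiple Sets
  - The number of Drives in a Set is fixed; by default the system calculates it automatically from the cluster size
  - The Drives in a Set are spread across different nodes as far as possible
Relationship between Set and Drive
Set and Drive are the two most important concepts in MinIO; an object is ultimately stored on a Set.
The diagram below sketches how a MinIO cluster stores data. Each row is one node (machine), and there are 32 nodes. Each small square inside a node is a Drive, which can simply be thought of as a hard disk. A node has 32 Drives, i.e. 32 hard disks.
A Set is a separate concept: it is a group of Drives. In the diagram, all the Drives with a blue or orange background make up one Set.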
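Deploying MinIO
MinIO has two deployment modes:
- standalone: a single-node MinIO
- distributed: a distributed MinIO cluster; note that a distributed MinIO needs at least four drives
A single-node MinIO service is a single point of failure. In a distributed MinIO with N drives, data is not lost as long as N/2 drives are online, but writing data requires at least N/2+1 drives. For example, in a 16-node MinIO cluster with 16 drives per node, the cluster stays readable even if 8 servers go down, but 9 servers are needed to write data.
As long as you respect the limits of distributed MinIO, you can combine nodes and drives per node freely: for example, 2 nodes with 4 drives each, or 4 nodes with 2 drives each, and so on.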
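The read and write thresholds above can be checked with a few lines of shell arithmetic. This sketch takes the N/2 and N/2+1 rule at face value and applies it to the 16 x 16 example; the function names are made up for illustration.

```shell
# Quorum arithmetic as stated above: N/2 drives to read, N/2+1 to write.
# The helper names are hypothetical.
read_quorum()  { echo $(( $1 / 2 )); }
write_quorum() { echo $(( $1 / 2 + 1 )); }

# Servers that must be up to supply a given number of drives
# (ceiling division of drives needed by drives per node).
servers_needed() { echo $(( ($1 + $2 - 1) / $2 )); }

nodes=16
drives_per_node=16
total=$(( nodes * drives_per_node ))
rq=$(read_quorum "$total")
wq=$(write_quorum "$total")

echo "read:  ${rq} drives, $(servers_needed "$rq" "$drives_per_node") servers"
echo "write: ${wq} drives, $(servers_needed "$wq" "$drives_per_node") servers"
# read:  128 drives, 8 servers
# write: 129 drives, 9 servers
```

The single extra drive needed for writes is what pushes the requirement from 8 to 9 servers in this example.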
Single node
Single node, single drive
Mounting the disk
root@master:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
loop0 7:0 0 63.9M 1 loop /snap/core20/2105
loop1 7:1 0 64M 1 loop /snap/core20/2379
loop2 7:2 0 87M 1 loop /snap/lxd/27037
loop3 7:3 0 87M 1 loop /snap/lxd/29351
loop4 7:4 0 40.4M 1 loop /snap/snapd/20671
loop5 7:5 0 38.8M 1 loop /snap/snapd/21759
sda 8:0 0 50G 0 disk
├─sda1 8:1 0 1M 0 part
├─sda2 8:2 0 2G 0 part /boot
└─sda3 8:3 0 48G 0 part
└─ubuntu--vg-ubuntu--lv 253:0 0 24G 0 lvm /
sdb 8:16 0 20G 0 disk
sr0 11:0 1 2G 0 rom
root@master:~# mkfs.ext4 /dev/sdb
mke2fs 1.46.5 (30-Dec-2021)
Creating filesystem with 5242880 4k blocks and 1310720 inodes
Filesystem UUID: e8bb820f-2bf9-456c-8871-c50fbc73fbd3
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000
Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
root@master:~# mkdir /minio1
root@master:~# vim /etc/fstab    # append the following line
/dev/sdb /minio1 ext4 defaults 0 0
root@master:~# mount -a
root@master:~# df -Th
Filesystem Type Size Used Avail Use% Mounted on
tmpfs tmpfs 388M 1.7M 386M 1% /run
/dev/mapper/ubuntu--vg-ubuntu--lv ext4 24G 9.7G 13G 44% /
tmpfs tmpfs 1.9G 0 1.9G 0% /dev/shm
tmpfs tmpfs 5.0M 0 5.0M 0% /run/lock
/dev/sda2 ext4 2.0G 253M 1.6G 14% /boot
tmpfs tmpfs 388M 4.0K 388M 1% /run/user/1000
/dev/sdb ext4 20G 24K 19G 1% /minio1
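On the host, create a docker-compose.yml file:
services:
  minio:
    hostname: minio
    container_name: minio
    image: quay.io/minio/minio:RELEASE.2024-09-13T20-26-02Z
    ports:
      - "9000:9000"
      - "9001:9001"
    volumes:
      - "/etc/localtime:/etc/localtime"
      # Mount the disk prepared above at the same path the server command uses;
      # otherwise the data ends up in the container's own filesystem
      - "/minio1:/minio1"
    environment:
      # MINIO_ACCESS_KEY / MINIO_SECRET_KEY are the older variable names;
      # current MinIO documentation uses MINIO_ROOT_USER / MINIO_ROOT_PASSWORD
      MINIO_ACCESS_KEY: admin
      MINIO_SECRET_KEY: ytxcc123
    command:
      - server
      - /minio1
      - --console-address
      - ":9001"
      - --address
      - ":9000"
Finally, open the web console on port 9001.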
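Single node, multiple drives
Add a few more disks and mount them under sequentially numbered paths (/minio1 through /minio4 here); each path also has to be mounted into the container under volumes. Then change
    command:
      - server
      - /minio1
      - --console-address
      - ":9001"
      - --address
      - ":9000"
to
    command:
      - server
      - /minio{1...4}
      - --console-address
      - ":9001"
      - --address
      - ":9000"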
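The `{1...4}` notation uses three dots and is expanded by MinIO itself, unlike the shell's two-dot brace expansion. To preview which paths or URLs such a pattern stands for, a small helper like the one below can be used. It is an illustrative sketch, not MinIO's implementation: the name `expand_ellipsis` is made up, and zero-padded ranges are not handled.

```shell
# Sketch: expand MinIO-style {a...b} ranges into one line per combination.
# expand_ellipsis is a hypothetical helper for previewing patterns.
expand_ellipsis() {
  local prefix rest range lo hi i
  case $1 in
    *\{*...*\}*)
      prefix=${1%%\{*}      # text before the first "{"
      rest=${1#*\}}         # text after the first "}"
      range=${1#*\{}
      range=${range%%\}*}   # text between the first "{" and "}"
      lo=${range%%...*}
      hi=${range##*...}
      ;;
    *)
      printf '%s\n' "$1"
      return 0
      ;;
  esac
  # Leave the string untouched if the bounds are not plain numbers
  case "${lo}:${hi}" in
    :*|*:|*[!0-9:]*)
      printf '%s\n' "$1"
      return 0
      ;;
  esac
  i=$lo
  while [ "$i" -le "$hi" ]; do
    # Recurse so that later {a...b} groups in the same string are expanded too
    expand_ellipsis "${prefix}${i}${rest}"
    i=$((i + 1))
  done
}

expand_ellipsis "/minio{1...4}"
```

Calling it with a pattern that contains two ranges, such as `http://minio{1...2}/data{1...2}`, prints all four host and path combinations.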
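Cluster
Multiple nodes, single drive
Use the following docker-compose.yml on each of the four hosts:
services:
  minio:
    container_name: minio
    image: quay.io/minio/minio:RELEASE.2024-09-13T20-26-02Z
    network_mode: host
    volumes:
      - /etc/localtime:/etc/localtime:ro
      # The path in each server URL below is resolved inside the container,
      # so the volume is mounted at the same path as on the host
      - /data/minio1/data:/data/minio1/data
    environment:
      MINIO_ACCESS_KEY: admin
      MINIO_SECRET_KEY: ytxcc123
    command:
      - server
      - http://192.168.142.155/data/minio1/data
      - http://192.168.142.156/data/minio1/data
      - http://192.168.142.157/data/minio1/data
      - http://192.168.142.158/data/minio1/data
Start it:
docker compose up -d
Once every node in the cluster has started, access MinIO on port 9000.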
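To confirm that each node answers on port 9000, MinIO's liveness endpoint `/minio/health/live` can be polled with curl. The sketch below assumes the four IP addresses used in this walkthrough and plain HTTP; the function names are made up for illustration.

```shell
# Build the liveness URL for one node (port defaults to 9000)
health_url() {
  printf 'http://%s:%s/minio/health/live\n' "$1" "${2:-9000}"
}

# Probe every host given as an argument; returns non-zero if any is down
check_nodes() {
  local host rc
  rc=0
  for host in "$@"; do
    if curl -fsS -o /dev/null --max-time 5 "$(health_url "$host")"; then
      echo "${host}: live"
    else
      echo "${host}: not reachable"
      rc=1
    fi
  done
  return "$rc"
}

# Example, run from a machine that can reach the nodes:
# check_nodes 192.168.142.155 192.168.142.156 192.168.142.157 192.168.142.158
```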
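Multiple nodes, multiple drives
This builds on the configuration above with a few changes. Change the data volume
    volumes:
      - /data/minio1/data:/data/minio1/data
to one mount per drive:
    volumes:
      - /minio/data1:/data1
      - /minio/data2:/data2
      - /minio/data3:/data3
      - /minio/data4:/data4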
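    command:
      - server
      - http://192.168.142.155/data/minio1/data
      - http://192.168.142.156/data/minio1/data
      - http://192.168.142.157/data/minio1/data
      - http://192.168.142.158/data/minio1/data
Change this to a single pattern that covers every host and every drive. The hostnames minio1 through minio4 are the ones mapped to the four IP addresses via extra_hosts, and /data1 through /data4 are the mount points inside the container:
    command:
      - server
      - http://minio{1...4}/data{1...4}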
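Also add the following (the command entries are appended to the existing command list, and "off" is quoted so that YAML does not read it as a boolean):
    extra_hosts:
      - minio1:192.168.142.155
      - minio2:192.168.142.156
      - minio3:192.168.142.157
      - minio4:192.168.142.158
    environment:
      MINIO_UPDATE: "off"
    command:
      - --console-address
      - "0.0.0.0:9001"
      - --address
      - "0.0.0.0:9000"
    privileged: true
Start it:
docker compose up -d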
Configuring load balancing
upstream minio {
    server 192.168.142.155:9000;
    server 192.168.142.156:9000;
    server 192.168.142.157:9000;
    server 192.168.142.158:9000;
}
upstream console {
    ip_hash;
    # Every node serves the console on 9001 (see --console-address above)
    server 192.168.142.155:9001;
    server 192.168.142.156:9001;
    server 192.168.142.157:9001;
    server 192.168.142.158:9001;
}
server {
    listen 9000;
    listen [::]:9000;
    server_name localhost;
    # To allow special characters in headers
    ignore_invalid_headers off;
    # Allow any size file to be uploaded.
    # Set to a value such as 1000m; to restrict file size to a specific value
    client_max_body_size 0;
    # To disable buffering
    proxy_buffering off;
    location / {
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_connect_timeout 300;
        # Default is HTTP/1, keepalive is only enabled in HTTP/1.1
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        chunked_transfer_encoding off;
        proxy_pass http://minio;
    }
}
server {
    listen 9001;
    listen [::]:9001;
    server_name localhost;
    # To allow special characters in headers
    ignore_invalid_headers off;
    # Allow any size file to be uploaded.
    # Set to a value such as 1000m; to restrict file size to a specific value
    client_max_body_size 0;
    # To disable buffering
    proxy_buffering off;
    location / {
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-NginX-Proxy true;
        # This is necessary to pass the correct IP to be hashed
        real_ip_header X-Real-IP;
        proxy_connect_timeout 300;
        # To support websocket
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        chunked_transfer_encoding off;
        proxy_pass http://console;
    }
}
Tuning
cat > sysctl.conf <<EOF
# maximum number of open files/file descriptors
fs.file-max = 4194303
# use as little swap space as possible
vm.swappiness = 1
# prioritize application RAM against disk/swap cache
vm.vfs_cache_pressure = 50
# minimum free memory
vm.min_free_kbytes = 1000000
# follow mellanox best practices https://community.mellanox.com/s/article/linux-sysctl-tuning
# the following changes are recommended for improving IPv4 traffic performance by Mellanox
# disable the TCP timestamps option for better CPU utilization
net.ipv4.tcp_timestamps = 0
# enable the TCP selective acks option for better throughput
net.ipv4.tcp_sack = 1
# increase the maximum length of processor input queues
net.core.netdev_max_backlog = 250000
# increase the TCP maximum and default buffer sizes using setsockopt()
net.core.rmem_max = 4194304
net.core.wmem_max = 4194304
net.core.rmem_default = 4194304
net.core.wmem_default = 4194304
net.core.optmem_max = 4194304
# increase memory thresholds to prevent packet dropping:
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 65536 4194304
# enable low latency mode for TCP:
net.ipv4.tcp_low_latency = 1
# the following variable is used to tell the kernel how much of the socket buffer
# space should be used for TCP window size, and how much to save for an application
# buffer. A value of 1 means the socket buffer will be divided evenly between.
# TCP windows size and application.
net.ipv4.tcp_adv_win_scale = 1
# maximum number of incoming connections
net.core.somaxconn = 65535
# maximum number of packets queued
net.core.netdev_max_backlog = 10000
# queue length of completely established sockets waiting for accept
net.ipv4.tcp_max_syn_backlog = 4096
# time to wait (seconds) for FIN packet
net.ipv4.tcp_fin_timeout = 15
# disable icmp send redirects
net.ipv4.conf.all.send_redirects = 0
# disable icmp accept redirect
net.ipv4.conf.all.accept_redirects = 0
# drop packets with LSR or SSR
net.ipv4.conf.all.accept_source_route = 0
# MTU discovery, only enable when ICMP blackhole detected
net.ipv4.tcp_mtu_probing = 1
EOF
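The heredoc above only writes sysctl.conf to the current directory; the settings take effect once the file is loaded. A minimal sketch for previewing the effective lines and then loading them follows. The helper name `list_sysctl_settings` is made up, and loading requires root.

```shell
# Print only the key = value lines, skipping comments and blank lines
list_sysctl_settings() {
  grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Preview what would be applied, then load the file (needs root):
# list_sysctl_settings sysctl.conf
# sudo sysctl -p sysctl.conf
```

Note that net.core.netdev_max_backlog appears twice in the file, so the preview is a convenient place to spot that the later value (10000) is the one that ends up in effect.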