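This article is shared from the Huawei Cloud community post "The relationship between cgroups, resource pools, and users in GaussDB(DWS)", author: nullptr_.
1. Preface
This article shows how cgroups, resource pools, and users relate to one another in DWS, to give a first overview of how resources are configured in DWS.
2. Scripts for creating the related objects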
gs_ssh -c "gs_cgroup -cS ClassN1 -G wn1"
gs_ssh -c "gs_cgroup -cS ClassN1 -G wn2"
gs_ssh -c "gs_cgroup -cS ClassN2 -G wn3"
gs_ssh -c "gs_cgroup -cS ClassG1 -G wg1_1"
gs_ssh -c "gs_cgroup -cS ClassG1 -G wg1_2"
gs_ssh -c "gs_cgroup -cS ClassG2 -G wg2_1"
gs_ssh -c "gs_cgroup -cS ClassG2 -G wg2_2"
# Create the resource pools
gsql -d postgres -p 6000 -c "create resource pool respool_1 with (control_group = 'ClassN1:wn1');"
gsql -d postgres -p 6000 -c "create resource pool respool_2 with (control_group = 'ClassN1:wn2');"
gsql -d postgres -p 6000 -c "create resource pool respool_3 with (control_group = 'ClassN2:wn3');"
gsql -d postgres -p 6000 -c "create resource pool respool_4 with (control_group = 'ClassN2:wn3');"
gsql -d postgres -p 6000 -c "create resource pool respool_grp_1 with (control_group = 'ClassG1');"
gsql -d postgres -p 6000 -c "create resource pool respool_g1_job_1 with (control_group = 'ClassG1:wg1_1');"
gsql -d postgres -p 6000 -c "create resource pool respool_g1_job_2 with (control_group = 'ClassG1:wg1_2');"
gsql -d postgres -p 6000 -c "create resource pool respool_grp_2 with (control_group = 'ClassG2');"
gsql -d postgres -p 6000 -c "create resource pool respool_g2_job_1 with (control_group = 'ClassG2:wg2_1');"
gsql -d postgres -p 6000 -c "create resource pool respool_g2_job_2 with (control_group = 'ClassG2:wg2_2');"
# Create the tenants and users
gsql -d postgres -p 6000 -c "CREATE USER user_1 RESOURCE POOL 'respool_1' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c "CREATE USER user_2 RESOURCE POOL 'respool_2' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c "CREATE USER user_3 RESOURCE POOL 'respool_3' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c "CREATE USER user_4 PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c "CREATE USER user_5 PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c "CREATE USER user_grp_1 RESOURCE POOL 'respool_grp_1' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c "CREATE USER user_g1_job_1 RESOURCE POOL 'respool_g1_job_1' USER GROUP 'user_grp_1' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c "CREATE USER user_g1_job_2 RESOURCE POOL 'respool_g1_job_2' USER GROUP 'user_grp_1' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c "CREATE USER user_grp_2 RESOURCE POOL 'respool_grp_2' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c "CREATE USER user_g2_job_1 RESOURCE POOL 'respool_g2_job_1' USER GROUP 'user_grp_2' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c "CREATE USER user_g2_job_2 RESOURCE POOL 'respool_g2_job_2' USER GROUP 'user_grp_2' PASSWORD 'Gauss_ab1' ;"
gsql -d postgres -p 6000 -c "CREATE USER user_grp_3 RESOURCE POOL 'respool_grp_1' PASSWORD 'Gauss_ab1' ;"
3. cgroup
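Resource pools are the core of GaussDB(DWS) workload management, and before resource pools can be configured, the control groups (Cgroups) must be set up in the environment.
The Class control group is the top-level control group in which database workloads run. When the cluster is deployed, a default sub-Class control group named "DefaultClass" is generated automatically. The Medium control group of DefaultClass hosts jobs triggered by the system; its resources cannot be modified, and jobs running in it are not subject to resource management. For this reason it is recommended to create new sub-Class control groups and their Workload control groups to set resource ratios.
3.1 cgroup layout after running the scripts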
per910mas@xx:~> gs_cgroup -p
Top Group information is listed:
GID: 0 Type: Top Percent(%): 1000( 50) Name: Root Cores: 0-103
GID: 1 Type: Top Percent(%): 833( 83) Name: Gaussdb:per910mas Cores: 0-103
GID: 2 Type: Top Percent(%): 333( 40) Name: Backend Cores: 0-103
GID: 3 Type: Top Percent(%): 499( 60) Name: Class Cores: 0-103
Backend Group information is listed:
GID: 4 Type: BAKWD Name: DefaultBackend TopGID: 2 Percent(%): 266(80) Cores: 0-103
GID: 5 Type: BAKWD Name: Vacuum TopGID: 2 Percent(%): 66(20) Cores: 0-103
Class Group information is listed:
GID: 20 Type: CLASS Name: DefaultClass TopGID: 3 Percent(%): 99(20) MaxLevel: 1 RemPCT: 100 Cores: 0-103
GID: 21 Type: CLASS Name: ClassN1 TopGID: 3 Percent(%): 99(20) MaxLevel: 3 RemPCT: 60 Cores: 0-103
GID: 22 Type: CLASS Name: ClassN2 TopGID: 3 Percent(%): 99(20) MaxLevel: 2 RemPCT: 80 Cores: 0-103
GID: 23 Type: CLASS Name: ClassG1 TopGID: 3 Percent(%): 99(20) MaxLevel: 3 RemPCT: 60 Cores: 0-103
GID: 24 Type: CLASS Name: ClassG2 TopGID: 3 Percent(%): 99(20) MaxLevel: 3 RemPCT: 60 Cores: 0-103
Workload Group information is listed:
GID: 86 Type: DEFWD Name: wn1:2 ClsGID: 21 Percent(%): 19(20) WDLevel: 2 Cores: 0-103
GID: 87 Type: DEFWD Name: wn2:3 ClsGID: 21 Percent(%): 19(20) WDLevel: 3 Cores: 0-103
GID: 89 Type: DEFWD Name: wn3:2 ClsGID: 22 Percent(%): 19(20) WDLevel: 2 Cores: 0-103
GID: 91 Type: DEFWD Name: wg1_1:2 ClsGID: 23 Percent(%): 19(20) WDLevel: 2 Cores: 0-103
GID: 92 Type: DEFWD Name: wg1_2:3 ClsGID: 23 Percent(%): 19(20) WDLevel: 3 Cores: 0-103
GID: 94 Type: DEFWD Name: wg2_1:2 ClsGID: 24 Percent(%): 19(20) WDLevel: 2 Cores: 0-103
GID: 95 Type: DEFWD Name: wg2_2:3 ClsGID: 24 Percent(%): 19(20) WDLevel: 3 Cores: 0-103
CM Group information is listed:
Timeshare Group information is listed:
GID: 724 Type: TSWD Name: Low Rate: 1
GID: 725 Type: TSWD Name: Medium Rate: 2
GID: 726 Type: TSWD Name: High Rate: 4
GID: 727 Type: TSWD Name: Rush Rate: 8
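System resource restrictions come in two forms: quotas and limits. Quotas are the default.
- Quota: a flexible form of control. For example, wn1:2 has a quota of 20%. Under normal conditions the group may use more than 20% of the resources; when resources are saturated (utilization at 100%), usage is restricted strictly according to the quota.
- Limit: directly restricts the range of CPU cores that can be used.
- Quota & limit: applies the quota percentage within the restricted range of CPU cores.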
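The two numbers in each Percent(%) column of the gs_cgroup -p output above appear to be related: the value in parentheses is the group's quota relative to its parent, and the value in front of it matches the parent's value scaled by that quota and truncated to an integer. Likewise, RemPCT matches 100 minus the quotas already handed out to the Class's Workload groups. The sketch below is only an observation drawn from this one output, not a documented formula; the helper names dynamic_share and remaining_percent are made up for illustration.

```python
def dynamic_share(parent_share: int, percent: int) -> int:
    """Scale the parent's share by the child's quota (in percent), truncating."""
    return parent_share * percent // 100


def remaining_percent(workload_percents: list[int]) -> int:
    """Quota left in a Class after its Workload groups took their percentages."""
    return 100 - sum(workload_percents)


# Starting value copied from the "Gaussdb:per910mas" row of the output.
gaussdb = 833

backend = dynamic_share(gaussdb, 40)      # "Backend" row
class_top = dynamic_share(gaussdb, 60)    # "Class" row
class_n1 = dynamic_share(class_top, 20)   # "ClassN1" row
wn1 = dynamic_share(class_n1, 20)         # "wn1:2" row

print(backend, class_top, class_n1, wn1)
print(remaining_percent([20, 20]))  # ClassN1 holds wn1 and wn2
print(remaining_percent([20]))      # ClassN2 holds only wn3
```

Under this reading, wn1:2 is guaranteed roughly 2% of the machine's CPU when everything is saturated, even though its quota inside ClassN1 is 20%.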
4. Resource pools
4.1 Resource pool layout
postgres=# select oid,* from pg_resource_pool;
oid | respool_name | mem_percent | cpu_affinity | control_group | active_statements | max_dop | memory_limit | parentid | io_limits | io_priority | nodegroup | is_foreign | short_acc | except_rule | weight
------------+------------------+-------------+--------------+---------------------+-------------------+---------+--------------+------------+-----------+-------------+--------------+------------+-----------+-------------+--------
10 | default_pool | 0 | -1 | DefaultClass:Medium | -1 | -1 | default | 0 | 0 | None | installation | f | t | None | -1
2147585814 | respool_1 | 0 | -1 | ClassN1:wn1 | 10 | -1 | default | 0 | 0 | None | installation | f | t | None | -1
2147585815 | respool_2 | 0 | -1 | ClassN1:wn2 | 10 | -1 | default | 0 | 0 | None | installation | f | t | None | -1
2147585816 | respool_3 | 0 | -1 | ClassN2:wn3 | 10 | -1 | default | 0 | 0 | None | installation | f | t | None | -1
2147585817 | respool_grp_1 | 20 | -1 | ClassG1 | 10 | -1 | default | 0 | 0 | None | installation | f | t | None | -1
2147585818 | respool_g1_job_1 | 20 | -1 | ClassG1:wg1_1 | 10 | -1 | default | 2147585817 | 0 | None | installation | f | t | None | -1
2147585819 | respool_g1_job_2 | 20 | -1 | ClassG1:wg1_2 | 10 | -1 | default | 2147585817 | 0 | None | installation | f | t | None | -1
2147585820 | respool_grp_2 | 20 | -1 | ClassG2 | 10 | -1 | default | 0 | 0 | None | installation | f | t | None | -1
2147585821 | respool_g2_job_1 | 20 | -1 | ClassG2:wg2_1 | 10 | -1 | default | 2147585820 | 0 | None | installation | f | t | None | -1
2147585822 | respool_g2_job_2 | 20 | -1 | ClassG2:wg2_2 | 10 | -1 | default | 2147585820 | 0 | None | installation | f | t | None | -1
2147586195 | respool_4 | 0 | -1 | ClassN2:wn3 | 10 | -1 | default | 0 | 0 | None | installation | f | t | None | -1
(11 rows)
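4.1.1 Restrictions on group resource pools
A Class control group can back only one group resource pool. Because respool_grp_1 is already bound to ClassG1, creating a second group resource pool on the same Class fails: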
per910mas@xx:~> gsql -d postgres -p 6000 -c "create resource pool respool_grp_3 with (control_group = 'ClassG1');"
ERROR: resource pool with control_group ClassG1 has been existed in the two-layer resource pool list
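The behaviour seen in this article can be summarised in a small check: a control group given as a bare Class name (a group resource pool) must not already be in use, whereas a Class:Workload control group may be shared, as respool_3 and respool_4 both do with ClassN2:wn3. The sketch below only models what this article's output shows and is not the server's actual validation logic; can_create is a hypothetical helper.

```python
# Pool name -> control_group, copied from the creation script in section 2.
EXISTING = {
    "respool_3": "ClassN2:wn3",
    "respool_4": "ClassN2:wn3",
    "respool_grp_1": "ClassG1",
    "respool_g1_job_1": "ClassG1:wg1_1",
    "respool_grp_2": "ClassG2",
}


def can_create(control_group: str, existing: dict[str, str] = EXISTING) -> bool:
    """Return False if a group-level control group is already taken."""
    is_group_level = ":" not in control_group
    if is_group_level and control_group in existing.values():
        return False
    return True


print(can_create("ClassG1"))      # group-level, already bound to respool_grp_1
print(can_create("ClassN2:wn3"))  # workload-level, sharing was accepted above
```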
4.1.2 Business resource pools
A resource pool's memory share, mem_percent, has to be computed proportionally through the pool hierarchy.
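As a worked example of the hierarchical calculation, the sketch below multiplies mem_percent along the parentid chain of a pool, using rows copied from the pg_resource_pool output in section 4.1. It assumes that a child pool's mem_percent is expressed relative to its parent pool's share; that reading and the helper name effective_mem_percent are assumptions made for illustration.

```python
from fractions import Fraction

# (oid, respool_name, mem_percent, parentid), copied from pg_resource_pool.
POOLS = [
    (2147585817, "respool_grp_1", 20, 0),
    (2147585818, "respool_g1_job_1", 20, 2147585817),
    (2147585819, "respool_g1_job_2", 20, 2147585817),
]


def effective_mem_percent(oid: int, pools=POOLS) -> float:
    """Multiply mem_percent from the pool up to its top-level ancestor."""
    by_oid = {row[0]: row for row in pools}
    share = Fraction(1)
    while oid != 0:  # parentid 0 marks a top-level pool
        _, _, mem_percent, parentid = by_oid[oid]
        share *= Fraction(mem_percent, 100)
        oid = parentid
    return float(share * 100)


print(effective_mem_percent(2147585817))  # group pool
print(effective_mem_percent(2147585818))  # business pool under it
```

Under this assumption, respool_g1_job_1 ends up with 20% of the 20% held by respool_grp_1, that is, 4% of the total.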
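4.1.3 Default resource pool
When resource management is enabled, the system creates a default resource pool named default_pool. Any session or user without an explicitly associated resource pool is associated with default_pool. By default, default_pool is bound to the DefaultClass:Medium control group, and neither its concurrency nor its memory is controlled. The parameters of default_pool can be modified, but jobs associated with it are still subject to the global concurrency limit max_active_statements. When an administrator performs O&M operations that should not be controlled, run SET session_respool='root'; before executing the SQL to switch to the O&M queue; jobs are then no longer controlled.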
5. Users
5.1 User layout
postgres=# select * from pg_user;
usename | usesysid | usecreatedb | usesuper | usecatupd | userepl | passwd | valbegin | valuntil | respool | parent | spacelimit | useconfig | nodegroup | tempspacelimit | spillspacelimit
---------------+------------+-------------+----------+-----------+---------+----------+----------+----------+------------------+------------+------------+-----------+-----------+----------------+-----------------
per910mas | 10 | t | t | t | t | ******** | | | default_pool | 0 | | | | |
u1 | 2147558961 | f | f | f | f | ******** | | | default_pool | 0 | | | | |
user_1 | 2147585823 | f | f | f | f | ******** | | | respool_1 | 0 | | | | |
user_2 | 2147585827 | f | f | f | f | ******** | | | respool_2 | 0 | | | | |
user_3 | 2147585831 | f | f | f | f | ******** | | | respool_3 | 0 | | | | |
user_4 | 2147585835 | f | f | f | f | ******** | | | default_pool | 0 | | | | |
user_5 | 2147585839 | f | f | f | f | ******** | | | default_pool | 0 | | | | |
user_grp_1 | 2147585843 | f | f | f | f | ******** | | | respool_grp_1 | 0 | | | | |
user_g1_job_1 | 2147585847 | f | f | f | f | ******** | | | respool_g1_job_1 | 2147585843 | | | | |
user_g1_job_2 | 2147585851 | f | f | f | f | ******** | | | respool_g1_job_2 | 2147585843 | | | | |
user_grp_2 | 2147585855 | f | f | f | f | ******** | | | respool_grp_2 | 0 | | | | |
user_g2_job_1 | 2147585859 | f | f | f | f | ******** | | | respool_g2_job_1 | 2147585855 | | | | |
user_g2_job_2 | 2147585863 | f | f | f | f | ******** | | | respool_g2_job_2 | 2147585855 | | | | |
user_grp_3 | 2147586254 | f | f | f | f | ******** | | | respool_grp_1 | 0 | | | | |
(14 rows)
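5.1.2 Multi-tenant scenario
- Business users share the resources of their group user, and group users share the resources of the resource pool they belong to.
- A business user must be attached to a group user, and the user hierarchy must correspond one-to-one with the resource pool hierarchy.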
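The one-to-one correspondence can be checked against the two catalog outputs shown earlier: for a business user, the pool of its parent user should be the parent of its own pool. The sketch below does this on rows copied from pg_resource_pool (section 4.1) and pg_user (section 5.1); hierarchy_matches is a hypothetical helper, and the final call uses a deliberately mismatched combination that does not exist in the article's cluster.

```python
# Pool name -> (oid, parentid), copied from pg_resource_pool.
POOLS = {
    "respool_grp_1": (2147585817, 0),
    "respool_g1_job_1": (2147585818, 2147585817),
    "respool_grp_2": (2147585820, 0),
    "respool_g2_job_1": (2147585821, 2147585820),
}

# User name -> (usesysid, respool, parent), copied from pg_user.
USERS = {
    "user_grp_1": (2147585843, "respool_grp_1", 0),
    "user_g1_job_1": (2147585847, "respool_g1_job_1", 2147585843),
    "user_grp_2": (2147585855, "respool_grp_2", 0),
    "user_g2_job_1": (2147585859, "respool_g2_job_1", 2147585855),
}


def hierarchy_matches(respool: str, parent: int, users=USERS, pools=POOLS) -> bool:
    """True if the user's parent sits in the pool that is the parent of respool."""
    _, pool_parent = pools[respool]
    if parent == 0:
        # A user without a group user must be in a top-level pool.
        return pool_parent == 0
    parent_pool = next(rp for (uid, rp, _) in users.values() if uid == parent)
    return pools[parent_pool][0] == pool_parent


# Every user from the article satisfies the rule.
print(all(hierarchy_matches(rp, par) for (_, rp, par) in USERS.values()))

# Hypothetical mismatch: a pool under respool_grp_2 combined with group user user_grp_1.
print(hierarchy_matches("respool_g2_job_1", 2147585843))
```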