I. Viewing Database and Table Information
(1) Checking the Database Version
- SELECT version();
  Returns: PostgreSQL 9.2.4 (GaussDB 9.1.0 build 53f3b265) compiled at 2025-03-21 14:49:19 commit 9292 last mr 24793
(2) Viewing Table Information
- Use the **\d** command: in the gsql command-line tool, run \d table_name to view basic table information, including columns, data types, and indexes.
- Use the **\d+** command: run \d+ table_name to view more detailed information, such as per-column details and the table's storage options.
- Query the **information_schema.columns** view: run SELECT * FROM information_schema.columns WHERE table_name = 'table_name'; to get column information for the specified table, including column name, data type, and nullability.
- Query the **pg_tables** system catalog: run SELECT * FROM pg_tables WHERE tablename = 'table_name'; to view basic table metadata, such as table name, schema, and tablespace.
```sql
select count(1) from table_for_test_index; -- total rows: 4579369
```
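For example, a minimal query against information_schema.columns (shown here against the table_for_test_index table used in the case study below) could be:

```sql
-- List the columns of a table in declaration order, with type and nullability.
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_name = 'table_for_test_index'
ORDER BY ordinal_position;
```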
II. Index Management
(1) Viewing Indexes
- Use the **\d** command: in the gsql command-line tool, run \d table_name to view a table's index information.
- Query the **pg_indexes** system catalog: run SELECT * FROM pg_indexes WHERE tablename = 'table_name'; to view the indexes on the specified table, including index name, table name, and index definition.
(2) Creating Indexes
- Create a regular index:
  CREATE INDEX index_name ON table_name (column_name);
- Create a unique index:
  CREATE UNIQUE INDEX index_name ON table_name (column_name);
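A quick sketch of the workflow, using a hypothetical index name idx_test_team_id on the case-study table, followed by a pg_indexes check to confirm the definition:

```sql
-- Create a hypothetical single-column index, then verify it was recorded.
CREATE INDEX idx_test_team_id ON table_for_test_index (team_id);

SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'table_for_test_index';
```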
III. Query Optimization
(1) Execution Plan Analysis
- Use **EXPLAIN ANALYZE** to view a query's execution plan and analyze its performance. Key metrics include each operator's actual execution time (A-time), actual rows returned (A-rows), estimated rows (E-rows), and peak memory usage (Peak Memory).
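A minimal sketch of the command against the case-study table (the predicate value here is illustrative):

```sql
-- EXPLAIN ANALYZE actually executes the statement and reports real timings,
-- so take care when profiling INSERT/UPDATE/DELETE.
EXPLAIN ANALYZE
SELECT data_id, item_name
FROM table_for_test_index
WHERE team_id = 1122
ORDER BY data_id
LIMIT 10;
```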
(2) Optimization Recommendations
- Add indexes: create indexes on the columns used in query predicates to speed up filtering and sorting.
- Optimize the query statement: arrange the predicates sensibly so the database can exploit indexes.
- Maintain indexes regularly: run VACUUM and ANALYZE on tables periodically to keep indexes and statistics accurate.
IV. Database Maintenance Operations
(1) VACUUM
- Purpose: reclaims the dead space produced by DELETE and UPDATE operations and refreshes visibility information, reducing wasted disk space and improving query performance.
- Usage: VACUUM [VERBOSE] [table_name];
- Example: VACUUM VERBOSE table_for_test_index;
(2) ANALYZE
- Purpose: collects table statistics that help the query planner generate better execution plans.
- Usage: ANALYZE [VERBOSE] [table_name];
- Example: ANALYZE VERBOSE table_for_test_index;
(3) Combined Command
- VACUUM ANALYZE: first runs VACUUM to reclaim space, then runs ANALYZE to refresh statistics.
- Example: VACUUM ANALYZE activity;
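To check when a table was last vacuumed or analyzed, the pg_stat_user_tables view can help; this is standard PostgreSQL and assumed here to behave similarly in GaussDB:

```sql
-- Most recent manual and automatic VACUUM/ANALYZE timestamps for a table.
SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'table_for_test_index';
```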
V. Case Study

```sql
create table table_for_test_index
(
    data_id            bigint,
    inspect_status     text,
    item_name          varchar(4000),
    item_content       varchar(4000),
    item_result        text,
    province_code      varchar(400),
    province_name      varchar(400),
    area_code          varchar(400),
    area_name          varchar(400),
    sub_dealer_id      integer,
    sub_dealer_name    varchar(400),
    team_id            integer,
    team_name          varchar(400),
    -- The columns below are referenced by the indexes and the test query
    -- but were missing from the original DDL; their types are assumed.
    confirm_first_time timestamp,
    fst_pass_time      timestamp,
    order_no           varchar(400),
    dealer_name        varchar(400),
    dealer_team_name   varchar(400),
    supo_dealer_manager_name varchar(400)
);
```
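To reproduce a comparable data volume (the count query in section I showed 4579369 rows), synthetic data can be bulk-loaded; a sketch using generate_series, with fabricated values that only cover the columns the test query filters on:

```sql
-- Synthetic bulk load; all values are made up for illustration only.
INSERT INTO table_for_test_index
    (data_id, item_name, item_result, province_code, area_code,
     sub_dealer_id, team_id, confirm_first_time)
SELECT g,
       (ARRAY['固定', '焊接', '线缆', '阴影'])[1 + (g % 4)],
       CASE WHEN g % 10 = 0 THEN '不合格' ELSE '合格' END,
       'a',
       (ARRAY['aa', 'cc', 'vv', 'ss'])[1 + (g % 4)],
       11,
       1122,
       timestamp '2025-01-01' + (g % 150) * interval '1 day'
FROM generate_series(1, 4579369) AS g;
```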
```sql
-- Planned indexes
-- Scenario 2: single-column index
CREATE INDEX idx_table_for_test_index_confirm_first_time ON table_for_test_index (confirm_first_time);
-- Scenario 4: composite index
CREATE INDEX idx_confirm_first_time_team_id ON table_for_test_index (confirm_first_time, team_id);

-- Target SQL
explain analyse
select order_no,
       item_name,
       item_content,
       item_result,
       province_name,
       area_name,
       dealer_name,
       dealer_team_name,
       supo_dealer_manager_name,
       to_char(fst_pass_time, 'YYYY-MM-DD')      fst_pass_time,
       to_char(confirm_first_time, 'YYYY-MM-DD') as confirm_first_time
from table_for_test_index
where 1 = 1
  AND confirm_first_time >= '2025-02-28T00:00'
  and '2025-05-22T23:59:59.999999999' >= confirm_first_time
  and item_name in ('固定', '焊接', '线缆', '阴影')
  and item_result in ('不合格')
  and area_code in ('aa', 'cc', 'vv', 'ss')
  and province_code in ('a')
  and sub_dealer_id in (11)
  and team_id in (1122)
order by data_id
LIMIT 10;
```
(1) Scenario 1: No Index
```
 id |                    operation                     |        A-time        | A-rows | E-rows |  Peak Memory   | E-memory | A-width | E-width |  E-costs
----+--------------------------------------------------+----------------------+--------+--------+----------------+----------+---------+---------+-----------
  1 | ->  Limit                                        | 1448.363             |      1 |     10 | 8KB            |          |         |     307 | 136675.50
  2 |   ->  Streaming (type: GATHER)                   | 1448.360             |      1 |     30 | 160KB          |          |         |     307 | 136679.86
  3 |     ->  Limit                                    | [1396.774, 1425.354] |      1 |     30 | [8KB, 8KB]     | 1MB      |         |     307 | 136671.86
  4 |       ->  Sort                                   | [1396.769, 1425.350] |      1 |    663 | [40KB, 40KB]   | 16MB     | [0,360] |     307 | 136672.38
  5 |         ->  Subquery Scan on table_for_test_index | [1396.596, 1425.256] |     1 |    662 | [24KB, 32KB]   | 1MB      |         |     307 | 136667.05
  6 |           ->  Hash Semi Join (7, 8)              | [1396.594, 1425.183] |      1 |    662 | [24KB, 32KB]   | 1MB      |         |     307 | 136665.94
  7 |             ->  Seq Scan on table_for_test_index | [1396.586, 1424.857] |      1 |    662 | [80KB, 80KB]   | 1MB      |         |     314 | 136662.74
  8 |             ->  Hash                             | [0.076, 0.076]       |     88 |    264 | [296KB, 296KB] | 16MB     | [23,23] |      32 | 1.10
  9 |               ->  Values Scan on "*VALUES*"      | [0.031, 0.031]       |     88 |    264 | [16KB, 16KB]   | 1MB      |         |      32 | 1.10

Predicate Information (identified by plan id)
---------------------------------------------
  6 --Hash Semi Join (7, 8)
        Hash Cond: ((table_for_test_index.area_code)::text = "*VALUES*".column1)
  7 --Seq Scan on table_for_test_index
        Filter: ((confirm_first_time >= '2025-02-28 00:00:00'::timestamp without time zone) AND ('2025-05-23 00:00:00'::timestamp without time zone >= confirm_first_time) AND (item_result = '不合格'::text) AND ((province_code)::text = '45'::text) AND (sub_dealer_id = 1077) AND (team_id = 1122) AND ((item_name)::text = ANY ('{压块固定问题,焊接防腐,线缆虚接,阴影遮挡}'::text[])))
        Rows Removed by Filter: 4579368

Memory Information (identified by plan id)
------------------------------------------
Coordinator Query Peak Memory:
        Query Peak Memory: 0MB
Datanode:
        Max Query Peak Memory: 0MB
        Min Query Peak Memory: 0MB
  4 --Sort
        Sort Method: quicksort  Memory: 25kB ~ 25kB
  8 --Hash
        Max Buckets: 32768  Max Batches: 1  Max Memory Usage: 4kB
        Min Buckets: 0  Min Batches: 0  Min Memory Usage: 0kB

User Define Profiling
---------------------
Plan Node id: 2  Track name: coordinator get datanode connection
       (actual time=[9.030, 9.030], calls=[1, 1])
Plan Node id: 2  Track name: coordinator begin transaction
       (actual time=[0.002, 0.002], calls=[1, 1])
Plan Node id: 2  Track name: coordinator send command
       (actual time=[0.091, 0.091], calls=[3, 3])
Plan Node id: 2  Track name: coordinator get the first tuple
       (actual time=[37.865, 37.865], calls=[1, 1])
Plan Node id: 7  Track name: heap scan read buffer
       (sample time=[72.655, 77.479], estimated time=[726.567, 774.834], calls=[87041, 87065])

====== Query Summary =====
--------------------------
Datanode executor start time [dn_6005_6006, dn_6001_6002]: [0.358 ms,0.399 ms]
Datanode executor run time [dn_6005_6006, dn_6003_6004]: [1398.015 ms,1435.453 ms]
Datanode executor end time [dn_6003_6004, dn_6001_6002]: [0.031 ms,0.037 ms]
System available mem: 4546560KB
Query Max mem: 4546560KB
Query estimated mem: 32768KB
Enqueue time: 34.512 ms
Coordinator executor start time: 0.203 ms
Coordinator executor run time: 1450.309 ms
Coordinator executor end time: 0.041 ms
Parser runtime: 0.000 ms
Planner runtime: 2.931 ms
Query Id: 73464969046324330
Unique SQL Id: 2238842644
Unique SQL Hash: sql_07acf26f0590c5ec805e5875efec3870
Total runtime: 1488.180 ms
```
- Execution time: total runtime was 1488.180 ms.
- Main operator nodes:
  - Sequential scan (Seq Scan): with no index available, the database scans the whole table to find matching rows. A huge amount of data is read and then discarded (Rows Removed by Filter: 4579368), which accounts for most of the query time.
  - Hash semi join (Hash Semi Join): joins two data sets, one from the sequential scan and one from a Values Scan, on the area_code condition.
  - Sort: orders the result set by data_id using quicksort, with 25kB of memory.
  - Limit: caps the output at 10 rows, reducing the data ultimately returned.
- Problem analysis: the sequential scan is the bottleneck; massive amounts of data are scanned only to be filtered out, wasting most of the execution time.
(2) Scenario 2: Single-Column Index on confirm_first_time
```
 id |                       operation                       |       A-time       | A-rows  | E-rows  |  Peak Memory   | E-memory | A-width | E-width |  E-costs
----+-------------------------------------------------------+--------------------+---------+---------+----------------+----------+---------+---------+-----------
  1 | ->  Limit                                             | 636.978            |       1 |      10 | 8KB            |          |         |     307 | 102387.55
  2 |   ->  Streaming (type: GATHER)                        | 636.974            |       1 |      30 | 160KB          |          |         |     307 | 102391.90
  3 |     ->  Limit                                         | [614.182, 626.947] |       1 |      30 | [8KB, 8KB]     | 1MB      |         |     307 | 102383.90
  4 |       ->  Sort                                        | [614.180, 626.944] |       1 |     663 | [40KB, 40KB]   | 16MB     | [0,360] |     307 | 102384.43
  5 |         ->  Subquery Scan on table_for_test_index     | [614.131, 626.898] |       1 |     662 | [24KB, 32KB]   | 1MB      |         |     307 | 102379.10
  6 |           ->  Hash Semi Join (7, 9)                   | [614.128, 626.895] |       1 |     662 | [24KB, 32KB]   | 1MB      |         |     307 | 102377.99
  7 |             ->  Bitmap Heap Scan on table_for_test_index | [613.883, 626.598] |    1 |     662 | [72KB, 72KB]   | 1MB      |         |     314 | 102374.79
  8 |               ->  Bitmap Index Scan                   | [69.465, 75.358]   | 1082697 | 1081447 | [4MB, 4MB]     | 1MB      |         |       0 | 3609.07
  9 |             ->  Hash                                  | [0.071, 0.083]     |     264 |     264 | [296KB, 296KB] | 16MB     | [23,23] |      32 | 1.10
 10 |               ->  Values Scan on "*VALUES*"           | [0.027, 0.045]     |     264 |     264 | [16KB, 16KB]   | 1MB      |         |      32 | 1.10

Predicate Information (identified by plan id)
---------------------------------------------
  6 --Hash Semi Join (7, 9)
        Hash Cond: ((table_for_test_index.area_code)::text = "*VALUES*".column1)
  7 --Bitmap Heap Scan on table_for_test_index
        Recheck Cond: ((confirm_first_time >= '2025-02-28 00:00:00'::timestamp without time zone) AND ('2025-05-23 00:00:00'::timestamp without time zone >= confirm_first_time))
        Filter: ((item_result = '不合格'::text) AND ((province_code)::text = '45'::text) AND (sub_dealer_id = 1077) AND (team_id = 1122) AND ((item_name)::text = ANY ('{压块,焊接,线缆,阴影}'::text[])))
        Rows Removed by Filter: 1082696
  8 --Bitmap Index Scan
        Index Cond: ((confirm_first_time >= '2025-02-28 00:00:00'::timestamp without time zone) AND ('2025-05-23 00:00:00'::timestamp without time zone >= confirm_first_time))

Memory Information (identified by plan id)
------------------------------------------
Coordinator Query Peak Memory:
        Query Peak Memory: 2MB
Datanode:
        Max Query Peak Memory: 5MB
        Min Query Peak Memory: 5MB
  4 --Sort
        Sort Method: quicksort  Memory: 25kB ~ 25kB
  9 --Hash
        Max Buckets: 32768  Max Batches: 1  Max Memory Usage: 4kB
        Min Buckets: 32768  Min Batches: 1  Min Memory Usage: 4kB

User Define Profiling
---------------------
Plan Node id: 2  Track name: coordinator get datanode connection
       (actual time=[5.730, 5.730], calls=[1, 1])
Plan Node id: 2  Track name: coordinator begin transaction
       (actual time=[0.003, 0.003], calls=[1, 1])
Plan Node id: 2  Track name: coordinator send command
       (actual time=[0.141, 0.141], calls=[3, 3])
Plan Node id: 2  Track name: coordinator get the first tuple
       (actual time=[13.444, 13.444], calls=[1, 1])
Plan Node id: 8  Track name: bitmap heap scan read buffer
       (sample time=[35.496, 36.298], estimated time=[355.004, 363.007], calls=[40175, 40206])

====== Query Summary =====
--------------------------
Datanode executor start time [dn_6005_6006, dn_6003_6004]: [0.413 ms,0.486 ms]
Datanode executor run time [dn_6005_6006, dn_6003_6004]: [614.189 ms,626.956 ms]
Datanode executor end time [dn_6005_6006, dn_6001_6002]: [0.054 ms,0.156 ms]
System available mem: 4546560KB
Query Max mem: 4546560KB
Query estimated mem: 32768KB
Enqueue time: 5.010 ms
Coordinator executor start time: 0.317 ms
Coordinator executor run time: 637.256 ms
Coordinator executor end time: 0.043 ms
Parser runtime: 0.000 ms
Planner runtime: 4.348 ms
Query Id: 73464969046932357
Unique SQL Id: 2238842644
Unique SQL Hash: sql_07acf26f0590c5ec805e5875efec3870
Total runtime: 647.191 ms
```
- Execution time: total runtime was 647.191 ms.
- Main operator nodes:
  - Bitmap Index Scan: uses the confirm_first_time index to locate candidate rows quickly. It actually returned 1082697 rows in [69.465, 75.358] ms, greatly reducing the data to process compared with a sequential scan.
  - Bitmap Heap Scan: fetches the full rows from the heap based on the bitmap produced by the index scan, then applies the remaining filters (item_result, province_code, and so on).
  - Sort: orders the result set by data_id using quicksort, with 25kB of memory.
  - Limit: caps the output at 10 rows, reducing the data ultimately returned.
- Optimization effect: compared with the unindexed run, query time was cut by more than half, showing that the index delivers a significant performance gain.
(3) Scenario 3: Indexes on confirm_first_time and data_id
```
 id |                       operation                       |       A-time       | A-rows  | E-rows  |  Peak Memory   | E-memory | A-width | E-width |  E-costs
----+-------------------------------------------------------+--------------------+---------+---------+----------------+----------+---------+---------+-----------
  1 | ->  Limit                                             | 628.006            |       1 |      10 | 8KB            |          |         |     307 | 102387.55
  2 |   ->  Streaming (type: GATHER)                        | 628.002            |       1 |      30 | 160KB          |          |         |     307 | 102391.90
  3 |     ->  Limit                                         | [610.628, 617.545] |       1 |      30 | [8KB, 8KB]     | 1MB      |         |     307 | 102383.90
  4 |       ->  Sort                                        | [610.625, 617.543] |       1 |     663 | [40KB, 40KB]   | 16MB     | [0,360] |     307 | 102384.43
  5 |         ->  Subquery Scan on table_for_test_index     | [610.567, 617.491] |       1 |     662 | [24KB, 32KB]   | 1MB      |         |     307 | 102379.10
  6 |           ->  Hash Semi Join (7, 9)                   | [610.491, 617.487] |       1 |     662 | [24KB, 32KB]   | 1MB      |         |     307 | 102377.99
  7 |             ->  Bitmap Heap Scan on table_for_test_index | [610.248, 617.215] |    1 |     662 | [72KB, 72KB]   | 1MB      |         |     314 | 102374.79
  8 |               ->  Bitmap Index Scan                   | [68.714, 82.117]   | 1082697 | 1081447 | [4MB, 4MB]     | 1MB      |         |       0 | 3609.07
  9 |             ->  Hash                                  | [0.071, 0.074]     |     264 |     264 | [296KB, 296KB] | 16MB     | [23,23] |      32 | 1.10
 10 |               ->  Values Scan on "*VALUES*"           | [0.024, 0.028]     |     264 |     264 | [16KB, 16KB]   | 1MB      |         |      32 | 1.10

Predicate Information (identified by plan id)
---------------------------------------------
  6 --Hash Semi Join (7, 9)
        Hash Cond: ((table_for_test_index.area_code)::text = "*VALUES*".column1)
  7 --Bitmap Heap Scan on table_for_test_index
        Recheck Cond: ((confirm_first_time >= '2025-02-28 00:00:00'::timestamp without time zone) AND ('2025-05-23 00:00:00'::timestamp without time zone >= confirm_first_time))
        Filter: ((item_result = '不合格'::text) AND ((province_code)::text = '45'::text) AND (sub_dealer_id = 1077) AND (team_id = 1122) AND ((item_name)::text = ANY ('{压块,焊接,线缆,阴影}'::text[])))
        Rows Removed by Filter: 1082696
  8 --Bitmap Index Scan
        Index Cond: ((confirm_first_time >= '2025-02-28 00:00:00'::timestamp without time zone) AND ('2025-05-23 00:00:00'::timestamp without time zone >= confirm_first_time))

Memory Information (identified by plan id)
------------------------------------------
Coordinator Query Peak Memory:
        Query Peak Memory: 0MB
Datanode:
        Max Query Peak Memory: 5MB
        Min Query Peak Memory: 5MB
  4 --Sort
        Sort Method: quicksort  Memory: 25kB ~ 25kB
  9 --Hash
        Max Buckets: 32768  Max Batches: 1  Max Memory Usage: 4kB
        Min Buckets: 32768  Min Batches: 1  Min Memory Usage: 4kB

User Define Profiling
---------------------
Plan Node id: 2  Track name: coordinator get datanode connection
       (actual time=[5.962, 5.962], calls=[1, 1])
Plan Node id: 2  Track name: coordinator begin transaction
       (actual time=[0.002, 0.002], calls=[1, 1])
Plan Node id: 2  Track name: coordinator send command
       (actual time=[0.095, 0.095], calls=[3, 3])
Plan Node id: 2  Track name: coordinator get the first tuple
       (actual time=[7.891, 7.891], calls=[1, 1])
Plan Node id: 8  Track name: bitmap heap scan read buffer
       (sample time=[34.368, 35.879], estimated time=[343.731, 358.835], calls=[40175, 40206])

====== Query Summary =====
--------------------------
Datanode executor start time [dn_6005_6006, dn_6003_6004]: [0.504 ms,0.547 ms]
Datanode executor run time [dn_6001_6002, dn_6003_6004]: [610.664 ms,617.553 ms]
Datanode executor end time [dn_6005_6006, dn_6001_6002]: [0.051 ms,0.067 ms]
System available mem: 4546560KB
Query Max mem: 4546560KB
Query estimated mem: 32768KB
Enqueue time: 4.886 ms
Coordinator executor start time: 0.191 ms
Coordinator executor run time: 628.284 ms
Coordinator executor end time: 0.045 ms
Parser runtime: 0.000 ms
Planner runtime: 3.145 ms
Query Id: 73464969047523501
Unique SQL Id: 2238842644
Unique SQL Hash: sql_07acf26f0590c5ec805e5875efec3870
Total runtime: 636.740 ms
```
- Execution time: total runtime was 636.740 ms.
- Main operator nodes:
  - Bitmap Index Scan: uses the confirm_first_time index to locate candidate rows quickly, taking [68.714, 82.117] ms.
  - Bitmap Heap Scan: fetches the full rows from the heap based on the bitmap index scan's result and applies the remaining filters.
  - Sort: orders the result set by data_id using quicksort, with 25kB of memory. Since data_id is indexed, the sort may have taken advantage of that index, trimming the sort time further.
  - Limit: caps the output at 10 rows.
- Optimization effect: total execution time is slightly lower than in scenario 2, suggesting the data_id index gives the sort some help and further improves query efficiency.
(4) Scenario 4: Composite Index on confirm_first_time and team_id
```
 id |                       operation                       |      A-time      | A-rows | E-rows |  Peak Memory   | E-memory | A-width | E-width | E-costs
----+-------------------------------------------------------+------------------+--------+--------+----------------+----------+---------+---------+---------
  1 | ->  Limit                                             | 28.507           |      1 |     10 | 8KB            |          |         |     307 | 5987.59
  2 |   ->  Streaming (type: GATHER)                        | 28.504           |      1 |     30 | 96KB           |          |         |     307 | 5991.94
  3 |     ->  Limit                                         | [22.979, 24.620] |      1 |     30 | [8KB, 8KB]     | 1MB      |         |     307 | 5983.94
  4 |       ->  Sort                                        | [22.977, 24.616] |      1 |    663 | [40KB, 40KB]   | 16MB     | [0,360] |     307 | 5984.47
  5 |         ->  Subquery Scan on table_for_test_index     | [22.968, 24.602] |      1 |    662 | [24KB, 32KB]   | 1MB      |         |     307 | 5979.14
  6 |           ->  Hash Semi Join (7, 9)                   | [22.966, 24.584] |      1 |    662 | [24KB, 32KB]   | 1MB      |         |     307 | 5978.02
  7 |             ->  Bitmap Heap Scan on table_for_test_index | [22.859, 24.464] |   1 |    662 | [72KB, 72KB]   | 1MB      |         |     314 | 5974.83
  8 |               ->  Bitmap Index Scan                   | [22.762, 24.362] |     67 |   1145 | [32KB, 32KB]   | 1MB      |         |       0 | 4510.28
  9 |             ->  Hash                                  | [0.064, 0.075]   |    264 |    264 | [296KB, 296KB] | 16MB     | [23,23] |      32 | 1.10
 10 |               ->  Values Scan on "*VALUES*"           | [0.023, 0.033]   |    264 |    264 | [16KB, 16KB]   | 1MB      |         |      32 | 1.10

Predicate Information (identified by plan id)
---------------------------------------------
  6 --Hash Semi Join (7, 9)
        Hash Cond: ((table_for_test_index.area_code)::text = "*VALUES*".column1)
  7 --Bitmap Heap Scan on table_for_test_index
        Recheck Cond: ((confirm_first_time >= '2025-02-28 00:00:00'::timestamp without time zone) AND ('2025-05-23 00:00:00'::timestamp without time zone >= confirm_first_time) AND (team_id = 1122))
        Filter: ((item_result = '不合格'::text) AND ((province_code)::text = '45'::text) AND (sub_dealer_id = 1077) AND ((item_name)::text = ANY ('{压块,焊接,线缆,阴影}'::text[])))
        Rows Removed by Filter: 66
  8 --Bitmap Index Scan
        Index Cond: ((confirm_first_time >= '2025-02-28 00:00:00'::timestamp without time zone) AND ('2025-05-23 00:00:00'::timestamp without time zone >= confirm_first_time) AND (team_id = 1122))

Memory Information (identified by plan id)
------------------------------------------
Coordinator Query Peak Memory:
        Query Peak Memory: 0MB
Datanode:
        Max Query Peak Memory: 0MB
        Min Query Peak Memory: 0MB
  4 --Sort
        Sort Method: quicksort  Memory: 25kB ~ 25kB
  9 --Hash
        Max Buckets: 32768  Max Batches: 1  Max Memory Usage: 4kB
        Min Buckets: 32768  Min Batches: 1  Min Memory Usage: 4kB

User Define Profiling
---------------------
Plan Node id: 2  Track name: coordinator get datanode connection
       (actual time=[0.012, 0.012], calls=[1, 1])
Plan Node id: 2  Track name: coordinator begin transaction
       (actual time=[0.003, 0.003], calls=[1, 1])
Plan Node id: 2  Track name: coordinator send command
       (actual time=[0.097, 0.097], calls=[3, 3])
Plan Node id: 2  Track name: coordinator get the first tuple
       (actual time=[2.296, 2.296], calls=[1, 1])
Plan Node id: 8  Track name: bitmap heap scan read buffer
       (sample time=[0.000, 0.001], estimated time=[0.000, 0.011], calls=[21, 22])

====== Query Summary =====
--------------------------
Datanode executor start time [dn_6001_6002, dn_6005_6006]: [0.107 ms,0.128 ms]
Datanode executor run time [dn_6005_6006, dn_6001_6002]: [22.986 ms,24.641 ms]
Datanode executor end time [dn_6001_6002, dn_6003_6004]: [0.041 ms,0.045 ms]
System available mem: 4546560KB
Query Max mem: 4546560KB
Query estimated mem: 4362KB
Enqueue time: 0.055 ms
Coordinator executor start time: 0.229 ms
Coordinator executor run time: 28.775 ms
Coordinator executor end time: 0.045 ms
Parser runtime: 0.000 ms
Planner runtime: 3.295 ms
Query Id: 73464969050955362
Unique SQL Id: 2238842644
Unique SQL Hash: sql_07acf26f0590c5ec805e5875efec3870
Total runtime: 32.631 ms
```
Creating the composite index:

```sql
CREATE INDEX idx_confirm_first_time_team_id ON table_for_test_index (confirm_first_time, team_id);
```

- Execution time: total runtime fell to 32.631 ms, a dramatic improvement over the 1488.180 ms with no index, and better than either single-index scenario.
- Main operator nodes:
  - Bitmap Index Scan: uses the composite index on confirm_first_time and team_id to locate matching rows quickly, taking [22.762, 24.362] ms.
  - Bitmap Heap Scan: fetches the full rows from the heap based on the bitmap index scan's result and applies the remaining filters.
  - Sort: orders the result set by data_id using quicksort, with 25kB of memory.
  - Limit: caps the output at 10 rows.
- Optimization effect: the composite index drastically reduces the data scanned and the query time, confirming that a well-chosen index yields a major performance boost.
(5) Comparing the Four Scenarios
In GaussDB (PostgreSQL), query performance varies with whether indexes exist and which kind is used. The four scenarios summarize as follows:

Scenario 1: no index
- Behavior: the query scans the entire table, checking every row and discarding non-matching records.
- Performance: poor on large data volumes, with long execution times (1488.180 ms in this case).

Scenario 2: single-column index on confirm_first_time
- Behavior: the index quickly locates rows within the confirm_first_time range, shrinking the data that must be scanned.
- Performance: significantly better; execution time drops sharply, to 647.191 ms here.

Scenario 3: single-column indexes on both confirm_first_time and data_id
- Behavior: the confirm_first_time index filters the time range quickly, while the data_id index accelerates the sort.
- Performance: slightly better than scenario 2, at 636.740 ms here.

Scenario 4: composite index on confirm_first_time and team_id
- Behavior: the composite index applies both columns' predicates at once, further reducing the data scanned, without the overhead of maintaining two single-column indexes.
- Performance: the best of the four, at 32.631 ms here.
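Side by side, the four runs from this case study:

| Scenario | Index | Total runtime |
| --- | --- | --- |
| 1: no index | none (Seq Scan) | 1488.180 ms |
| 2: single-column index | confirm_first_time | 647.191 ms |
| 3: two single-column indexes | confirm_first_time, data_id | 636.740 ms |
| 4: composite index | (confirm_first_time, team_id) | 32.631 ms |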
VI. Problems You May Encounter When Adding Indexes, and Their Solutions
(1) Insufficient Privileges
- Problem: creating an index may fail with an error saying the user lacks permission to modify the table structure.
- Solution: make sure the current user holds the ALTER privilege on the table. It can be granted with:
  GRANT ALTER ON TABLE table_name TO username;
(2) Table Structure Conflicts
- Problem: on a distributed table (for example, one created with DISTRIBUTE BY HASH), creating a primary key or unique index may be rejected:
  [2025-05-20 15:20:45] [0A000] ERROR: Round-Robin table does not support PrimaryKey/UniqueIndex
- Solution: such distributed tables in GaussDB (PostgreSQL) do not support primary keys or unique indexes. Create a regular index instead to improve query performance:
  CREATE INDEX idx_column ON table_name (column_name);
(3) Index Creation Failures
- Problem: index creation may fail with errors such as a duplicate index name or an unsupported column type.
- Solutions:
  - Check the index name: make sure it is not already taken.
  - Check the column type: make sure the type supports indexing; for unsupported types, consider a functional index or changing the column type.
(4) Performance Impact
- Problem: building an index can affect write performance; on large tables the build may take a long time and consume substantial disk space.
- Solutions:
  - Pick the right time: build indexes during off-peak hours to minimize the impact on the business.
  - Build concurrently: use the CONCURRENTLY keyword to reduce write locks on the table. For example:
    CREATE INDEX CONCURRENTLY idx_column ON table_name (column_name);
-
(五)索引维护成本
-
问题描述 :索引会增加数据插入、更新和删除操作的维护成本,因为每次数据变更都需要更新索引。
-
解决方法 :
-
权衡索引数量 :根据查询需求合理创建索引,避免创建过多的索引。
-
定期维护索引 :定期对索引进行维护,如重建索引以提高性能。可以使用以下命令重建索引:
- 重建索引 :
REINDEX INDEX index_name;
- 重建索引 :
-
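When weighing an index's maintenance cost against its benefit, checking its on-disk size is useful; a sketch using PostgreSQL's size functions (assumed to be available in GaussDB):

```sql
-- List each index on the case-study table with a human-readable size.
SELECT indexname,
       pg_size_pretty(pg_relation_size(
           (quote_ident(schemaname) || '.' || quote_ident(indexname))::regclass)) AS index_size
FROM pg_indexes
WHERE tablename = 'table_for_test_index';
```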
VII. Summary
- Run **VACUUM** and **ANALYZE** regularly to keep database performance and query optimization in good shape: VACUUM reclaims dead space, and ANALYZE refreshes the statistics that let the planner produce better execution plans.
- Create indexes judiciously: index the columns that queries frequently filter and sort on. In particular, indexing predicate columns (such as confirm_first_time) and sort columns (such as data_id) can improve query performance markedly.
- Analyze execution plans: use EXPLAIN ANALYZE to inspect the plan, locate bottlenecks, and optimize accordingly, paying attention to per-operator execution time, rows returned, and memory usage.
- Handle index-creation issues: watch out for privilege problems, table structure conflicts, and creation failures, and apply the corresponding fixes. Maintain indexes regularly and balance their number to keep performance and query efficiency in equilibrium.
The above covers the key steps and techniques for database development and optimization in GaussDB (PostgreSQL), along with the problems you may hit when adding indexes and how to resolve them. I hope it proves useful in day-to-day work.