Hive ORDER BY, SORT BY, DISTRIBUTE BY, and CLUSTER BY: Comparison and Hands-On Examples

Table of Contents

[1. Overview](#1-overview)

[2. Details](#2-details)

[2.1 ORDER BY (Global Sort)](#21-order-by-global-sort)

[2.2 SORT BY (Local Sort)](#22-sort-by-local-sort)

[2.3 DISTRIBUTE BY (Data Distribution)](#23-distribute-by-data-distribution)

[2.4 DISTRIBUTE BY + SORT BY](#24-distribute-by--sort-by)

[2.5 CLUSTER BY (Distribute + Sort Shorthand)](#25-cluster-by-distribute--sort-shorthand)

[3. Comparison Table](#3-comparison-table)

[4. Notes and Best Practices](#4-notes-and-best-practices)

[5. Complete Examples](#5-complete-examples)

[Creating the test table and data](#creating-the-test-table-and-data)

[Example 1: ORDER BY (global sort)](#example-1-order-by-global-sort)

[Example 2: SORT BY (local sort)](#example-2-sort-by-local-sort)

[Example 3: DISTRIBUTE BY (distribution only)](#example-3-distribute-by-distribution-only)

[Example 4: DISTRIBUTE BY + SORT BY (most flexible, recommended)](#example-4-distribute-by--sort-by-most-flexible-recommended)

[Example 5: CLUSTER BY](#example-5-cluster-by)

[Example 6: Optimized distribution and sorting with Hive bucketed tables](#example-6-optimized-distribution-and-sorting-with-hive-bucketed-tables)


1. Overview

In large-scale data processing with Hive, ORDER BY performs a global sort, SORT BY sorts locally within each reducer, DISTRIBUTE BY controls how rows are distributed across reducers, and CLUSTER BY is shorthand for DISTRIBUTE BY combined with SORT BY on the same columns.

2. Details

2.1 ORDER BY (Global Sort)
  • Function: sorts the entire result set globally (global ordering).

  • How it works: all rows are sent to a single reducer, which sorts them and emits one fully ordered output file.

  • Syntax

```sql
SELECT ... FROM table_name
ORDER BY col1 [ASC|DESC], col2 [ASC|DESC], ...
[LIMIT n];
```
  • Characteristics: globally ordered, but only 1 reducer is used; on large datasets performance is poor and OOM is likely.

  • Use cases: small datasets, or scenarios that truly need strict global ordering (usually with LIMIT).

2.2 SORT BY (Local Sort)
  • Function: sorts within each reducer (local ordering).

  • How it works: rows are distributed across multiple reducers, and each reducer sorts only the rows it receives.

  • Syntax

```sql
SELECT ... FROM table_name
SORT BY col1 [ASC|DESC], col2 [ASC|DESC], ...
```
  • Characteristics: each reducer's output file is internally ordered, but ordering across files is not guaranteed.

  • Use cases: large datasets where per-file ordering is sufficient.
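
The local-versus-global distinction can be sketched outside Hive. The following is a Python simulation, not Hive code; the salary values are hypothetical (they mirror the emp table used later in this article), and rows are assigned to "reducers" randomly, mimicking SORT BY without a DISTRIBUTE BY clause:

```python
import random

# Hypothetical salary column; shuffle to mimic random distribution to reducers.
salaries = [5000, 6000, 5500, 7000, 6500, 5200, 7200, 4800, 5800, 7100]
random.shuffle(salaries)

# Three "reducers": each sorts only the rows it received (SORT BY salary DESC).
partitions = [sorted(salaries[i::3], reverse=True) for i in range(3)]

# Every partition (output file) is internally ordered ...
assert all(p == sorted(p, reverse=True) for p in partitions)

# ... but concatenating the files does not, in general, give a global order.
for i, p in enumerate(partitions):
    print(f"reducer {i}: {p}")
```

Running this a few times shows the per-partition ordering always holds while the value ranges of the partitions usually overlap, which is exactly what Example 2 below observes in the output files.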

2.3 DISTRIBUTE BY (Data Distribution)
  • Function: controls how rows are distributed to reducers (hash partitioning).

  • How it works: rows with the same key go to the same reducer.

  • Syntax

```sql
SELECT ... FROM table_name
DISTRIBUTE BY col1, col2, ...
```
  • Characteristics: distributes only, no sorting. Same-key rows cluster on the same reducer, which simplifies downstream processing.

  • Use cases: clustering same-key rows together; usually combined with SORT BY.

2.4 DISTRIBUTE BY + SORT BY
  • Function: first distributes rows by the specified columns, then sorts within each reducer by columns that may differ from the distribution columns.

  • How it works

    • DISTRIBUTE BY decides which reducer each row goes to (same keys cluster together).

    • SORT BY then orders the rows inside each reducer; DISTRIBUTE BY must be written before SORT BY.

  • Syntax

```sql
SELECT ... FROM table_name
DISTRIBUTE BY col1, col2, ...                    -- distribution columns
SORT BY colA [ASC|DESC], colB [ASC|DESC], ...;   -- sort columns (may differ from the distribution columns)
```
  • Characteristics

    • Same-key rows land on the same reducer.

    • Each reducer can sort by different columns than it was distributed by (most flexible).

    • Global ordering is not guaranteed.

  • Use cases: when the distribution columns differ from the sort columns (the most common combination in production).
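
The two-step mechanics can be sketched in plain Python. This is a simulation under assumed inputs, not Hive internals: for illustration the reducer is chosen by `dept_no % num_reducers`, a simplified stand-in for Hive's hash partitioning, and the rows are a subset of the emp table built later in this article:

```python
# (emp_id, name, dept_no, salary) rows, as in the emp table below.
rows = [
    (101, "Alice", 10, 5000), (102, "Bob", 20, 6000),
    (103, "Charlie", 10, 5500), (104, "David", 30, 7000),
    (105, "Eve", 20, 6500), (106, "Frank", 10, 5200),
]
num_reducers = 3

# Step 1 - DISTRIBUTE BY dept_no: same key, same reducer.
reducers = {r: [] for r in range(num_reducers)}
for row in rows:
    reducers[row[2] % num_reducers].append(row)

# Step 2 - SORT BY salary DESC: each reducer sorts only its own rows,
# and the sort column differs from the distribution column.
for part in reducers.values():
    part.sort(key=lambda row: row[3], reverse=True)

for r, part in reducers.items():
    print(f"reducer {r}: {part}")
```

Each reducer ends up holding complete departments, sorted internally by salary, while no ordering holds across reducers.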

2.5 CLUSTER BY (Distribute + Sort Shorthand)
  • Function: shorthand for DISTRIBUTE BY + SORT BY on the same columns (the two column lists must be identical).

  • How it works: first DISTRIBUTE BY, then SORT BY the same columns within each reducer.

  • Syntax

```sql
SELECT ... FROM table_name
CLUSTER BY col1, col2, ...;
```
  • Characteristics

    • Same-key rows cluster on the same reducer, and each partition is ordered by those columns.

    • Each output file is internally ordered, but global ordering is not guaranteed.

  • Use cases: when the distribution columns and sort columns are identical; the syntax is simply more concise.

Note: CLUSTER BY col is equivalent to DISTRIBUTE BY col SORT BY col. It cannot express different distribution and sort columns, and it sorts in ascending order only.

3. Comparison Table

| Clause | Sort scope | Reducers | Globally ordered? | Distribution behavior | Performance (large data) | Typical use case |
| --- | --- | --- | --- | --- | --- | --- |
| ORDER BY | Global | Only 1 | Yes | All rows funnel into a single reducer | Poor (OOM risk) | Small data; strict global ordering (usually with LIMIT) |
| SORT BY | Within each reducer | Many | No | Rows distributed to reducers randomly | Good | Large data where per-file ordering suffices |
| DISTRIBUTE BY | No sorting | Many | No | Same keys sent to the same reducer | Good | Clustering same-key rows without sorting |
| DISTRIBUTE BY + SORT BY | Within each reducer | Many | No | Same keys clustered, sortable by different columns | Good | Most flexible (distribution columns may or may not equal sort columns) |
| CLUSTER BY | Within each reducer | Many | No | Same keys clustered and ordered by the same column | Good | Concise form when the distribution column equals the sort column |

4. Notes and Best Practices

  • Set the reducer count (effective only for the current session):

```sql
hive (default)> SET mapreduce.job.reduces = 10;   -- set the number of reducers
```

  • Hive strict mode: ORDER BY usually must be accompanied by LIMIT.

  • Performance advice (large datasets):

    • Prefer DISTRIBUTE BY + SORT BY or CLUSTER BY.

    • Avoid ORDER BY directly on large tables.

    • Choose DISTRIBUTE BY columns with moderate cardinality and an even value distribution, to avoid data skew.

  • Combine with bucketed tables: declare CLUSTERED BY ... SORTED BY ... INTO n BUCKETS at table-creation time, and Hive can optimize queries automatically.

5. Complete Examples

Creating the test table and data

```sql
CREATE TABLE IF NOT EXISTS emp (
    emp_id   INT,
    emp_name STRING,
    dept_no  INT,
    salary   INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

INSERT INTO emp VALUES
(101,'Alice',10,5000),(102,'Bob',20,6000),(103,'Charlie',10,5500),
(104,'David',30,7000),(105,'Eve',20,6500),(106,'Frank',10,5200),
(107,'Grace',30,7200),(108,'Henry',20,4800),(109,'Ivy',10,5800),
(110,'Jack',30,7100);
```

Console output

```
hive (default)> CREATE TABLE IF NOT EXISTS emp (
              >     emp_id   INT,
              >     emp_name STRING,
              >     dept_no  INT,
              >     salary   INT
              > )
              > ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
OK
Time taken: 0.904 seconds
hive (default)> INSERT INTO emp VALUES
              > (101,'Alice',10,5000),(102,'Bob',20,6000),(103,'Charlie',10,5500),
              > (104,'David',30,7000),(105,'Eve',20,6500),(106,'Frank',10,5200),
              > (107,'Grace',30,7200),(108,'Henry',20,4800),(109,'Ivy',10,5800),
              > (110,'Jack',30,7100);
Query ID = liang_20260415154523_09c0f933-0109-41ca-b36d-bd7c03a30525
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1776237580716_0001, Tracking URL = http://node3:8088/proxy/application_1776237580716_0001/
Kill Command = /opt/module/hadoop-3.3.4/bin/mapred job  -kill job_1776237580716_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2026-04-15 15:45:36,487 Stage-1 map = 0%,  reduce = 0%
2026-04-15 15:45:45,762 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 4.21 sec
2026-04-15 15:45:53,962 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 7.15 sec
MapReduce Total cumulative CPU time: 7 seconds 150 msec
Ended Job = job_1776237580716_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://node2:8020/user/hive/warehouse/emp/.hive-staging_hive_2026-04-15_15-45-23_180_3061127256930716525-1/-ext-10000
Loading data to table default.emp
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 7.15 sec   HDFS Read: 18517 HDFS Write: 639 SUCCESS
Total MapReduce CPU Time Spent: 7 seconds 150 msec
OK
col1    col2    col3    col4
Time taken: 33.65 seconds
```

Example 1: ORDER BY (global sort)

```sql
SELECT * FROM emp
ORDER BY salary DESC
LIMIT 100;
```

Console output

```
hive (default)> SELECT * FROM emp
              > ORDER BY salary DESC
              > LIMIT 100;
Query ID = liang_20260415155941_0ad0a65f-1325-42cf-9d8f-94c512773b67
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1776237580716_0002, Tracking URL = http://node3:8088/proxy/application_1776237580716_0002/
Kill Command = /opt/module/hadoop-3.3.4/bin/mapred job  -kill job_1776237580716_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2026-04-15 15:59:48,969 Stage-1 map = 0%,  reduce = 0%
2026-04-15 15:59:57,202 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 8.56 sec
2026-04-15 16:00:01,327 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 10.68 sec
MapReduce Total cumulative CPU time: 10 seconds 680 msec
Ended Job = job_1776237580716_0002
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 10.68 sec   HDFS Read: 11168 HDFS Write: 382 SUCCESS
Total MapReduce CPU Time Spent: 10 seconds 680 msec
OK
emp.emp_id      emp.emp_name    emp.dept_no     emp.salary
107     Grace   30      7200
110     Jack    30      7100
104     David   30      7000
105     Eve     20      6500
102     Bob     20      6000
109     Ivy     10      5800
103     Charlie 10      5500
106     Frank   10      5200
101     Alice   10      5000
108     Henry   20      4800
Time taken: 20.851 seconds, Fetched: 10 row(s)
```

The job log shows that only 1 reducer was used, and the salary column is in overall descending order, confirming that ORDER BY produces a global sort.

Example 2: SORT BY (local sort)

SORT BY uses multiple reducers, so set the reducer count first. SORT BY produces one sorted file per reducer: each file is internally ordered, but the overall result set is not.

```sql
SET mapreduce.job.reduces=3;

INSERT OVERWRITE LOCAL DIRECTORY '/tmp/emp_sortby'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
SELECT * FROM emp
SORT BY salary DESC;
```

Console output

```
hive (default)> SET mapreduce.job.reduces=3;
hive (default)> INSERT OVERWRITE LOCAL DIRECTORY '/tmp/emp_sortby'
              > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
              > SELECT * FROM emp
              > SORT BY salary DESC;
Query ID = liang_20260415160535_8b50a22d-be8a-462e-a3b2-5b5fa13455b2
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Defaulting to jobconf value of: 3
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1776237580716_0003, Tracking URL = http://node3:8088/proxy/application_1776237580716_0003/
Kill Command = /opt/module/hadoop-3.3.4/bin/mapred job  -kill job_1776237580716_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 3
2026-04-15 16:05:44,001 Stage-1 map = 0%,  reduce = 0%
2026-04-15 16:05:50,140 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.12 sec
2026-04-15 16:05:57,433 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 6.74 sec
2026-04-15 16:05:58,465 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 9.66 sec
2026-04-15 16:05:59,485 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 13.7 sec
MapReduce Total cumulative CPU time: 13 seconds 700 msec
Ended Job = job_1776237580716_0003
Moving data to local directory /tmp/emp_sortby
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 3   Cumulative CPU: 13.7 sec   HDFS Read: 20798 HDFS Write: 175 SUCCESS
Total MapReduce CPU Time Spent: 13 seconds 700 msec
OK
emp.emp_id      emp.emp_name    emp.dept_no     emp.salary
Time taken: 26.687 seconds
```

The job log shows that SORT BY used 3 reducers.

Open a new terminal and inspect the local files:

```shell
[liang@node2 ~]$ ls /tmp/emp_sortby/
000000_0  000001_0  000002_0
[liang@node2 ~]$ cat /tmp/emp_sortby/000000_0
107     Grace   30      7200
110     Jack    30      7100
105     Eve     20      6500
109     Ivy     10      5800
106     Frank   10      5200
108     Henry   20      4800
[liang@node2 ~]$ cat /tmp/emp_sortby/000001_0
104     David   30      7000
102     Bob     20      6000
103     Charlie 10      5500
[liang@node2 ~]$ cat /tmp/emp_sortby/000002_0
101     Alice   10      5000
```

The /tmp/emp_sortby directory contains 3 result files. Inspecting them shows no obvious pattern in which rows went where: rows are distributed to reducers randomly (random partitioning), so the files may have overlapping ranges, and indeed the salary ranges of the 3 files visibly overlap here.

Example 3: DISTRIBUTE BY (distribution only)

To distribute rows to reducers by a rule, for example sending all rows of the same department to the same reducer, use the DISTRIBUTE BY clause.

```sql
SET mapreduce.job.reduces=3;

SELECT * FROM emp
DISTRIBUTE BY dept_no;

INSERT OVERWRITE LOCAL DIRECTORY '/tmp/emp_distributeby'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
SELECT * FROM emp
DISTRIBUTE BY dept_no;
```

Note: the reducer count is already set in this session, so there is no need to set it again; sessions are independent, so a new session must configure it separately.

Console output

```
hive (default)> SELECT * FROM emp
              > DISTRIBUTE BY dept_no;
Query ID = liang_20260415161624_bf44c56a-740f-4d34-9726-bc5cfe22e2a9
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Defaulting to jobconf value of: 3
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1776237580716_0004, Tracking URL = http://node3:8088/proxy/application_1776237580716_0004/
Kill Command = /opt/module/hadoop-3.3.4/bin/mapred job  -kill job_1776237580716_0004
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 3
2026-04-15 16:16:34,707 Stage-1 map = 0%,  reduce = 0%
2026-04-15 16:16:39,832 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.54 sec
2026-04-15 16:16:47,154 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 9.85 sec
2026-04-15 16:16:49,203 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 13.45 sec
MapReduce Total cumulative CPU time: 13 seconds 450 msec
Ended Job = job_1776237580716_0004
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 3   Cumulative CPU: 13.45 sec   HDFS Read: 21847 HDFS Write: 556 SUCCESS
Total MapReduce CPU Time Spent: 13 seconds 450 msec
OK
emp.emp_id      emp.emp_name    emp.dept_no     emp.salary
110     Jack    30      7100
107     Grace   30      7200
104     David   30      7000
109     Ivy     10      5800
106     Frank   10      5200
103     Charlie 10      5500
101     Alice   10      5000
108     Henry   20      4800
105     Eve     20      6500
102     Bob     20      6000
Time taken: 25.719 seconds, Fetched: 10 row(s)
```

The statement distributes rows by dept_no, and rows of the same department appear grouped together. To observe the result more clearly, write it out to local files:

```
hive (default)> INSERT OVERWRITE LOCAL DIRECTORY '/tmp/emp_distributeby'
              > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
              > SELECT * FROM emp
              > DISTRIBUTE BY dept_no;
Query ID = liang_20260415161928_8cec28ff-f008-41d8-a99a-43279f7962f2
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Defaulting to jobconf value of: 3
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1776237580716_0005, Tracking URL = http://node3:8088/proxy/application_1776237580716_0005/
Kill Command = /opt/module/hadoop-3.3.4/bin/mapred job  -kill job_1776237580716_0005
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 3
2026-04-15 16:19:37,306 Stage-1 map = 0%,  reduce = 0%
2026-04-15 16:19:45,561 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 9.71 sec
2026-04-15 16:19:52,948 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 14.07 sec
2026-04-15 16:19:55,190 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 22.3 sec
MapReduce Total cumulative CPU time: 22 seconds 300 msec
Ended Job = job_1776237580716_0005
Moving data to local directory /tmp/emp_distributeby
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 3   Cumulative CPU: 22.3 sec   HDFS Read: 20686 HDFS Write: 175 SUCCESS
Total MapReduce CPU Time Spent: 22 seconds 300 msec
OK
emp.emp_id      emp.emp_name    emp.dept_no     emp.salary
Time taken: 28.011 seconds
```

Inspect the local files

```shell
[liang@node2 ~]$ ls /tmp/emp_distributeby
000000_0  000001_0  000002_0
[liang@node2 ~]$ cat /tmp/emp_distributeby/000000_0
110     Jack    30      7100
107     Grace   30      7200
104     David   30      7000
[liang@node2 ~]$ cat /tmp/emp_distributeby/000001_0
109     Ivy     10      5800
106     Frank   10      5200
103     Charlie 10      5500
101     Alice   10      5000
[liang@node2 ~]$ cat /tmp/emp_distributeby/000002_0
108     Henry   20      4800
105     Eve     20      6500
102     Bob     20      6000
```

Rows of the same department were gathered into the same reducer.

Example 4: DISTRIBUTE BY + SORT BY (most flexible, recommended)

In the earlier example, SORT BY distributed rows to reducers randomly before sorting locally. To distribute rows by a rule instead (for example, sending each department's rows to the same reducer), combine it with DISTRIBUTE BY.

```sql
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/emp_dist_sort'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
SELECT * FROM emp
DISTRIBUTE BY dept_no
SORT BY salary DESC;
```

Console output

```
hive (default)> INSERT OVERWRITE LOCAL DIRECTORY '/tmp/emp_dist_sort'
              > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
              > SELECT * FROM emp
              > DISTRIBUTE BY dept_no
              > SORT BY salary DESC;
Query ID = liang_20260415164648_21fb1cb1-8dde-4787-a2f6-ee46185a4e4e
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Defaulting to jobconf value of: 3
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1776237580716_0007, Tracking URL = http://node3:8088/proxy/application_1776237580716_0007/
Kill Command = /opt/module/hadoop-3.3.4/bin/mapred job  -kill job_1776237580716_0007
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 3
2026-04-15 16:46:59,263 Stage-1 map = 0%,  reduce = 0%
2026-04-15 16:47:05,614 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 4.29 sec
2026-04-15 16:47:26,917 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 17.01 sec
2026-04-15 16:47:32,903 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 31.72 sec
2026-04-15 16:47:34,172 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 47.18 sec
MapReduce Total cumulative CPU time: 47 seconds 180 msec
Ended Job = job_1776237580716_0007
Moving data to local directory /tmp/emp_dist_sort
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 3   Cumulative CPU: 47.18 sec   HDFS Read: 20899 HDFS Write: 175 SUCCESS
Total MapReduce CPU Time Spent: 47 seconds 180 msec
OK
emp.emp_id      emp.emp_name    emp.dept_no     emp.salary
Time taken: 49.112 seconds
```

Inspect the local files

```shell
[liang@node2 ~]$ ls /tmp/emp_dist_sort
000000_0  000001_0  000002_0
[liang@node2 ~]$ cat /tmp/emp_dist_sort/000000_0
107     Grace   30      7200
110     Jack    30      7100
104     David   30      7000
[liang@node2 ~]$ cat /tmp/emp_dist_sort/000001_0
109     Ivy     10      5800
103     Charlie 10      5500
106     Frank   10      5200
101     Alice   10      5000
[liang@node2 ~]$ cat /tmp/emp_dist_sort/000002_0
105     Eve     20      6500
102     Bob     20      6000
108     Henry   20      4800
```

Rows of the same department went to the same reducer, and within each reducer the rows are sorted by salary in descending order.

Question: how are rows distributed when there are more departments than reducers?

Append two department-40 rows to emp, so that the department count exceeds the reducer count:

```sql
INSERT INTO emp VALUES (111, 'Tom', 40, 6600),(112, 'Uma', 40, 7200);
```

Console output

```
hive (default)> INSERT INTO emp VALUES (111, 'Tom', 40, 6600),(112, 'Uma', 40, 7200);
Query ID = liang_20260415165642_37c34831-562c-4eab-9b9c-1f94e0d49f38
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1776237580716_0008, Tracking URL = http://node3:8088/proxy/application_1776237580716_0008/
Kill Command = /opt/module/hadoop-3.3.4/bin/mapred job  -kill job_1776237580716_0008
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2026-04-15 16:56:50,065 Stage-1 map = 0%,  reduce = 0%
2026-04-15 16:56:58,254 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.98 sec
2026-04-15 16:57:04,449 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 7.23 sec
MapReduce Total cumulative CPU time: 7 seconds 230 msec
Ended Job = job_1776237580716_0008
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://node2:8020/user/hive/warehouse/emp/.hive-staging_hive_2026-04-15_16-56-42_773_2473216972157260403-1/-ext-10000
Loading data to table default.emp
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 7.23 sec   HDFS Read: 17674 HDFS Write: 351 SUCCESS
Total MapReduce CPU Time Spent: 7 seconds 230 msec
OK
col1    col2    col3    col4
Time taken: 22.992 seconds
```

Query the table again; the department-40 rows are now present:

```
hive (default)> select * from emp;
OK
emp.emp_id      emp.emp_name    emp.dept_no     emp.salary
101     Alice   10      5000
102     Bob     20      6000
103     Charlie 10      5500
104     David   30      7000
105     Eve     20      6500
106     Frank   10      5200
107     Grace   30      7200
108     Henry   20      4800
109     Ivy     10      5800
110     Jack    30      7100
111     Tom     40      6600
112     Uma     40      7200
Time taken: 0.108 seconds, Fetched: 12 row(s)
```

There are now 4 departments but only 3 reducers. Run the same statement again:

```sql
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/emp_dist_sort'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
SELECT * FROM emp
DISTRIBUTE BY dept_no
SORT BY salary DESC;
```

Console output

```
hive (default)> INSERT OVERWRITE LOCAL DIRECTORY '/tmp/emp_dist_sort'
              > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
              > SELECT * FROM emp
              > DISTRIBUTE BY dept_no
              > SORT BY salary DESC;
Query ID = liang_20260415170033_8455f9cc-eb85-4d0b-bb0f-ddfc5c8a5f0f
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Defaulting to jobconf value of: 3
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1776237580716_0009, Tracking URL = http://node3:8088/proxy/application_1776237580716_0009/
Kill Command = /opt/module/hadoop-3.3.4/bin/mapred job  -kill job_1776237580716_0009
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 3
2026-04-15 17:00:55,663 Stage-1 map = 0%,  reduce = 0%
2026-04-15 17:01:16,073 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 31.41 sec
2026-04-15 17:01:29,635 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 42.15 sec
2026-04-15 17:01:34,016 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 59.35 sec
MapReduce Total cumulative CPU time: 59 seconds 350 msec
Ended Job = job_1776237580716_0009
Moving data to local directory /tmp/emp_dist_sort
MapReduce Jobs Launched:
Stage-Stage-1: Map: 2  Reduce: 3   Cumulative CPU: 59.35 sec   HDFS Read: 26274 HDFS Write: 207 SUCCESS
Total MapReduce CPU Time Spent: 59 seconds 350 msec
OK
emp.emp_id      emp.emp_name    emp.dept_no     emp.salary
Time taken: 62.46 seconds
```

Inspect the local files

```shell
[liang@node2 ~]$ ls /tmp/emp_dist_sort
000000_0  000001_0  000002_0
[liang@node2 ~]$ cat /tmp/emp_dist_sort/000000_0
107     Grace   30      7200
110     Jack    30      7100
104     David   30      7000
[liang@node2 ~]$ cat /tmp/emp_dist_sort/000001_0
112     Uma     40      7200
111     Tom     40      6600
109     Ivy     10      5800
103     Charlie 10      5500
106     Frank   10      5200
101     Alice   10      5000
[liang@node2 ~]$ cat /tmp/emp_dist_sort/000002_0
105     Eve     20      6500
102     Bob     20      6000
108     Henry   20      4800
```

Departments 40 and 10 ended up in the same reducer. The reason: Hive hashes dept_no and takes the result modulo the reducer count (here 30 % 3 = 0, while both 10 % 3 and 40 % 3 = 1, and 20 % 3 = 2), so keys with the same remainder go to the same reducer.
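
The remainder arithmetic is easy to verify. A small sketch (for illustration the key is taken modulo the reducer count directly, a simplification of Hive's hash-then-modulo partitioning, which behaves the same for small integer keys):

```python
num_reducers = 3

# Which reducer each department lands on: key mod reducer count.
for dept in (10, 20, 30, 40):
    print(f"dept {dept} -> reducer {dept % num_reducers}")

# Departments 10 and 40 share remainder 1, so they share a reducer.
assert 10 % num_reducers == 40 % num_reducers == 1
```

This is why, with 4 departments and 3 reducers, one reducer necessarily receives two departments.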

Example 5: CLUSTER BY

When the distribution column and the sort column are the same, the pair can be abbreviated as CLUSTER BY; for example, DISTRIBUTE BY dept_no SORT BY dept_no is equivalent to CLUSTER BY dept_no.

```sql
SELECT * FROM emp DISTRIBUTE BY dept_no SORT BY dept_no;
-- abbreviated form
SELECT * FROM emp CLUSTER BY dept_no;
```

Console output

```
hive (default)> SELECT * FROM emp DISTRIBUTE BY dept_no SORT BY dept_no;
Query ID = liang_20260415171134_1ce45cb8-9792-4873-978d-9f9f48ef01e1
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Defaulting to jobconf value of: 3
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1776237580716_0010, Tracking URL = http://node3:8088/proxy/application_1776237580716_0010/
Kill Command = /opt/module/hadoop-3.3.4/bin/mapred job  -kill job_1776237580716_0010
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 3
2026-04-15 17:11:42,393 Stage-1 map = 0%,  reduce = 0%
2026-04-15 17:11:51,690 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 4.51 sec
2026-04-15 17:11:52,713 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 11.15 sec
2026-04-15 17:11:58,983 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 19.09 sec
2026-04-15 17:12:04,100 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 22.81 sec
MapReduce Total cumulative CPU time: 22 seconds 810 msec
Ended Job = job_1776237580716_0010
MapReduce Jobs Launched:
Stage-Stage-1: Map: 2  Reduce: 3   Cumulative CPU: 22.81 sec   HDFS Read: 27655 HDFS Write: 612 SUCCESS
Total MapReduce CPU Time Spent: 22 seconds 810 msec
OK
emp.emp_id      emp.emp_name    emp.dept_no     emp.salary
110     Jack    30      7100
107     Grace   30      7200
104     David   30      7000
109     Ivy     10      5800
106     Frank   10      5200
103     Charlie 10      5500
101     Alice   10      5000
112     Uma     40      7200
111     Tom     40      6600
108     Henry   20      4800
105     Eve     20      6500
102     Bob     20      6000
Time taken: 31.976 seconds, Fetched: 12 row(s)


hive (default)> SELECT * FROM emp CLUSTER BY dept_no;
Query ID = liang_20260415171259_398312a6-2819-48f0-be71-b129cc6c4a8d
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Defaulting to jobconf value of: 3
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1776237580716_0011, Tracking URL = http://node3:8088/proxy/application_1776237580716_0011/
Kill Command = /opt/module/hadoop-3.3.4/bin/mapred job  -kill job_1776237580716_0011
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 3
2026-04-15 17:13:08,095 Stage-1 map = 0%,  reduce = 0%
2026-04-15 17:13:34,379 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 18.44 sec
2026-04-15 17:13:50,464 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 30.99 sec
2026-04-15 17:13:53,980 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 41.73 sec
2026-04-15 17:13:56,370 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 52.54 sec
MapReduce Total cumulative CPU time: 52 seconds 540 msec
Ended Job = job_1776237580716_0011
MapReduce Jobs Launched:
Stage-Stage-1: Map: 2  Reduce: 3   Cumulative CPU: 52.54 sec   HDFS Read: 27655 HDFS Write: 612 SUCCESS
Total MapReduce CPU Time Spent: 52 seconds 540 msec
OK
emp.emp_id      emp.emp_name    emp.dept_no     emp.salary
110     Jack    30      7100
107     Grace   30      7200
104     David   30      7000
109     Ivy     10      5800
106     Frank   10      5200
103     Charlie 10      5500
101     Alice   10      5000
112     Uma     40      7200
111     Tom     40      6600
108     Henry   20      4800
105     Eve     20      6500
102     Bob     20      6000
Time taken: 59.58 seconds, Fetched: 12 row(s)
```

Both statements return the same result.

Example 6: Optimized distribution and sorting with Hive bucketed tables

When a bucketed table is created with CLUSTERED BY + SORTED BY + INTO n BUCKETS, the physical storage layout is fixed at table-creation time, so later queries can exploit the bucket structure automatically, without writing DISTRIBUTE BY / SORT BY / CLUSTER BY on every query.

1. Create the bucketed table

Bucket by department, sort each bucket by salary descending, 3 buckets in total:

```sql
CREATE TABLE emp_bucket (
    emp_id   INT,
    emp_name STRING,
    dept_no  INT,
    salary   INT
)
CLUSTERED BY (dept_no)
SORTED BY (salary DESC)
INTO 3 BUCKETS
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
```

CLUSTERED BY (dept_no): the bucketing column (controls hash distribution, analogous to DISTRIBUTE BY)

SORTED BY (salary DESC): the in-bucket sort column (each bucket file is internally ordered, analogous to SORT BY)

Console output

```
hive (default)> CREATE TABLE emp_bucket (
              >     emp_id   INT,
              >     emp_name STRING,
              >     dept_no  INT,
              >     salary   INT
              > )
              > CLUSTERED BY (dept_no)
              > SORTED BY (salary DESC)
              > INTO 3 BUCKETS
              > ROW FORMAT DELIMITED
              > FIELDS TERMINATED BY ',';
OK
Time taken: 0.107 seconds
```

2. Insert data into the bucketed table

```sql
-- enable automatic bucketing
SET hive.enforce.bucketing = true;

-- load from emp; Hive buckets and sorts automatically
INSERT OVERWRITE TABLE emp_bucket
SELECT * FROM emp;
```

Console output

```
hive (default)> INSERT OVERWRITE TABLE emp_bucket
              > SELECT * FROM emp;
Query ID = liang_20260415180744_8392f904-1ccb-4816-bf7c-baa70eb7aca7
Total jobs = 2
Launching Job 1 out of 2
Number of reduce tasks determined at compile time: 3
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1776237580716_0012, Tracking URL = http://node3:8088/proxy/application_1776237580716_0012/
Kill Command = /opt/module/hadoop-3.3.4/bin/mapred job  -kill job_1776237580716_0012
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 3
2026-04-15 18:07:55,139 Stage-1 map = 0%,  reduce = 0%
2026-04-15 18:08:05,819 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 21.48 sec
2026-04-15 18:08:15,323 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 29.92 sec
2026-04-15 18:08:18,489 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 36.49 sec
2026-04-15 18:08:23,756 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 45.09 sec
MapReduce Total cumulative CPU time: 45 seconds 90 msec
Ended Job = job_1776237580716_0012
Loading data to table default.emp_bucket
Launching Job 2 out of 2
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1776237580716_0013, Tracking URL = http://node3:8088/proxy/application_1776237580716_0013/
Kill Command = /opt/module/hadoop-3.3.4/bin/mapred job  -kill job_1776237580716_0013
Hadoop job information for Stage-3: number of mappers: 3; number of reducers: 1
2026-04-15 18:08:37,853 Stage-3 map = 0%,  reduce = 0%
2026-04-15 18:08:52,233 Stage-3 map = 33%,  reduce = 0%, Cumulative CPU 15.02 sec
2026-04-15 18:09:07,485 Stage-3 map = 67%,  reduce = 0%, Cumulative CPU 34.73 sec
2026-04-15 18:09:10,082 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 57.21 sec
2026-04-15 18:09:17,782 Stage-3 map = 100%,  reduce = 100%, Cumulative CPU 64.87 sec
MapReduce Total cumulative CPU time: 1 minutes 4 seconds 870 msec
Ended Job = job_1776237580716_0013
MapReduce Jobs Launched:
Stage-Stage-1: Map: 2  Reduce: 3   Cumulative CPU: 45.09 sec   HDFS Read: 35722 HDFS Write: 1214 SUCCESS
Stage-Stage-3: Map: 3  Reduce: 1   Cumulative CPU: 64.87 sec   HDFS Read: 23656 HDFS Write: 437 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 49 seconds 960 msec
OK
emp.emp_id      emp.emp_name    emp.dept_no     emp.salary
Time taken: 95.327 seconds
```

3. Query verification (the automatic optimization in action)

```sql
-- Direct query. 1) Simpler syntax: no DISTRIBUTE BY / SORT BY / CLUSTER BY needed.
-- 2) Hive can fetch the data directly, avoiding the cost of a MapReduce job.
SELECT * FROM emp_bucket;

-- Querying the bucketed table is equivalent to:
SELECT * FROM emp DISTRIBUTE BY dept_no SORT BY salary DESC;
```

Console output

```
hive (default)> SELECT * FROM emp_bucket;
OK
emp_bucket.emp_id       emp_bucket.emp_name     emp_bucket.dept_no      emp_bucket.salary
107     Grace   30      7200
110     Jack    30      7100
104     David   30      7000
112     Uma     40      7200
111     Tom     40      6600
109     Ivy     10      5800
103     Charlie 10      5500
106     Frank   10      5200
101     Alice   10      5000
105     Eve     20      6500
102     Bob     20      6000
108     Henry   20      4800
Time taken: 0.137 seconds, Fetched: 12 row(s)
```

The result is equivalent to that of the following statement:

```
hive (default)> SELECT * FROM emp
              > DISTRIBUTE BY dept_no
              > SORT BY salary DESC;
Query ID = liang_20260415182308_850726c5-15f9-4c76-a56d-9f3dc9b34f0a
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Defaulting to jobconf value of: 3
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1776237580716_0014, Tracking URL = http://node3:8088/proxy/application_1776237580716_0014/
Kill Command = /opt/module/hadoop-3.3.4/bin/mapred job  -kill job_1776237580716_0014
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 3
2026-04-15 18:23:30,355 Stage-1 map = 0%,  reduce = 0%
2026-04-15 18:24:05,044 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 38.53 sec
2026-04-15 18:24:33,589 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 51.11 sec
2026-04-15 18:24:38,316 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 61.51 sec
2026-04-15 18:24:40,630 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 73.33 sec
MapReduce Total cumulative CPU time: 1 minutes 13 seconds 330 msec
Ended Job = job_1776237580716_0014
MapReduce Jobs Launched:
Stage-Stage-1: Map: 2  Reduce: 3   Cumulative CPU: 73.33 sec   HDFS Read: 27655 HDFS Write: 612 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 13 seconds 330 msec
OK
emp.emp_id      emp.emp_name    emp.dept_no     emp.salary
107     Grace   30      7200
110     Jack    30      7100
104     David   30      7000
112     Uma     40      7200
111     Tom     40      6600
109     Ivy     10      5800
103     Charlie 10      5500
106     Frank   10      5200
101     Alice   10      5000
105     Eve     20      6500
102     Bob     20      6000
108     Henry   20      4800
Time taken: 95.508 seconds, Fetched: 12 row(s)
```

How the bucketed table optimizes automatically

  • Hive knows the table is bucketed by dept_no, so same-department rows are already physically clustered in the same bucket file, and each bucket file is already sorted by salary DESC. A query can read that ordered structure and fetch the data directly, without launching a MapReduce job, eliminating the runtime sort cost.
```shell
[liang@node2 ~]$ hdfs dfs -ls /user/hive/warehouse/emp_bucket
Found 3 items
-rw-r--r--   1 liang supergroup         53 2026-04-15 18:08 /user/hive/warehouse/emp_bucket/000000_0
-rw-r--r--   1 liang supergroup        104 2026-04-15 18:08 /user/hive/warehouse/emp_bucket/000001_0
-rw-r--r--   1 liang supergroup         50 2026-04-15 18:08 /user/hive/warehouse/emp_bucket/000002_0
[liang@node2 ~]$ hdfs dfs -cat /user/hive/warehouse/emp_bucket/000000_0
107,Grace,30,7200
110,Jack,30,7100
104,David,30,7000
[liang@node2 ~]$ hdfs dfs -cat /user/hive/warehouse/emp_bucket/000001_0
112,Uma,40,7200
111,Tom,40,6600
109,Ivy,10,5800
103,Charlie,10,5500
106,Frank,10,5200
101,Alice,10,5000
[liang@node2 ~]$ hdfs dfs -cat /user/hive/warehouse/emp_bucket/000002_0
105,Eve,20,6500
102,Bob,20,6000
108,Henry,20,4800
```

  • In joins, if the other table is bucketed on the same column into the same number of buckets (or a multiple of it), Hive can trigger a map-side join automatically, greatly improving performance.

Done. Enjoy!
