1. Create the database
hive> create database `scott`;
OK
Time taken: 0.281 seconds
hive> show databases;
OK
default
scott
Time taken: 0.02 seconds, Fetched: 2 row(s)
hive> use scott;
OK
Time taken: 0.045 seconds
hive> show tables;
OK
Time taken: 0.041 seconds
hive>
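Beyond the bare form above, CREATE DATABASE also accepts a comment and an explicit HDFS location. A hedged sketch (the comment text and path here are illustrative assumptions, not taken from the transcript):

```sql
-- Illustrative: comment and location are assumed values.
create database if not exists scott
comment 'demo database for the scott schema'
location '/user/hive/warehouse/scott.db';

-- Inspect what was created (location, comment, owner):
describe database scott;
```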
2. Create the tables
create table if not exists scott.emp(
empno int,
ename string,
job string,
mgr int,
hiredate date,
sal double,
comm double,
deptno int
)
comment 'scott'                                  -- table comment
row format delimited fields terminated by '\t'   -- column delimiter: '\t'
lines terminated by '\n'                         -- row delimiter: '\n'
stored as textfile                               -- storage file format
location '/user/hive/warehouse/scott.db/emp';    -- HDFS directory for the table's data
hive> create table if not exists scott.emp(
> empno int,
> ename string,
> job string,
> mgr int,
> hiredate date,
> sal double,
> comm double,
> deptno int
> )
> comment 'scott'
> row format delimited fields terminated by '\t'
> lines terminated by '\n'
> stored as textfile
> location '/user/hive/warehouse/scott.db/emp';
OK
Time taken: 0.241 seconds
hive> create table if not exists scott.dept(
> deptno int,
> dname string,
> loc string
> )
> comment 'scott.dept'
> row format delimited fields terminated by '\t'
> lines terminated by '\n'
> stored as textfile
> location '/user/hive/warehouse/scott.db/dept';
OK
Time taken: 0.09 seconds
hive> show tables;
OK
dept
emp
Time taken: 0.019 seconds, Fetched: 2 row(s)
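To verify that the delimiter, storage format, and location were applied as intended, Hive can print a table's full metadata or regenerate its DDL:

```sql
describe formatted scott.emp;   -- columns plus SerDe, field delimiter, location, etc.
show create table scott.dept;   -- the complete DDL Hive stored for the table
```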
3. Load data

Load from the local filesystem:
load data local inpath '/tmp/emp.csv' into table scott.emp;
load data local inpath '/tmp/dept.csv' into table scott.dept;
hive> load data local inpath '/tmp/emp.csv' into table scott.emp;
Loading data to table scott.emp
OK
Time taken: 1.369 seconds
hive> load data local inpath '/tmp/dept.csv' into table scott.dept;
Loading data to table scott.dept
OK
Time taken: 0.621 seconds
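Two variations worth knowing (note that despite the `.csv` names, the files must actually be tab-delimited to match the table DDL above). Without LOCAL the path is an HDFS path and the file is moved, not copied, into the table directory; OVERWRITE replaces the table's existing files instead of appending. A sketch with hypothetical paths:

```sql
-- HDFS source (hypothetical path); the file is MOVED into the table directory:
load data inpath '/data/emp.csv' into table scott.emp;

-- Replace all existing data instead of appending:
load data local inpath '/tmp/emp.csv' overwrite into table scott.emp;
```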
4. Query
hive> select * from emp e , dept d where e.deptno = d.deptno;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the
future versions. Consider using a different execution engine (i.e. spark, tez) or
using Hive 1.X releases.
Query ID = root_20260325152412_91d4c949-38d4-403b-b2d7-20bd9bf29f5c
Total jobs = 1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/bigdata/hive-2.1.1/lib/log4j-slf4j-impl-
2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/bigdata/hadoop-
2.7.1/share/hadoop/common/lib/slf4j-log4j12-
1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2026-03-25 15:24:19 Starting to launch local task to process map join;
maximum memory = 477626368
2026-03-25 15:24:20 Dump the side-table for tag: 1 with group count: 4 into
file: file:/opt/bigdata/hive-2.1.1/temp/root/daa2520c-4797-47ab-b0bd-
e027ebf2e63c/hive_2026-03-25_15-24-12_809_1603657098055148777-1/-local-
10004/HashTable-Stage-3/MapJoin-mapfile01--.hashtable
2026-03-25 15:24:20 Uploaded 1 File to: file:/opt/bigdata/hive-
2.1.1/temp/root/daa2520c-4797-47ab-b0bd-e027ebf2e63c/hive_2026-03-25_15-24-
12_809_1603657098055148777-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile01-
-.hashtable (404 bytes)
2026-03-25 15:24:20 End of local task; Time Taken: 1.559 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1774421907259_0001, Tracking URL =
http://c001:8099/proxy/application_1774421907259_0001/
Kill Command = /opt/bigdata/hadoop-2.7.1/bin/hadoop job -kill
job_1774421907259_0001
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2026-03-25 15:24:55,911 Stage-3 map = 0%, reduce = 0%
2026-03-25 15:25:04,638 Stage-3 map = 100%, reduce = 0%, Cumulative CPU 2.89 sec
MapReduce Total cumulative CPU time: 2 seconds 890 msec
Ended Job = job_1774421907259_0001
MapReduce Jobs Launched:
Stage-Stage-3: Map: 1 Cumulative CPU: 2.89 sec HDFS Read: 8648 HDFS Write:
1198 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 890 msec
OK
7369	SMITH	CLERK	7902	1980-12-17	800.0	NULL	20	20	RESEARCH	DALLAS
7499	ALLEN	SALESMAN	7698	1981-02-20	1600.0	300.0	30	30	SALES	CHICAGO
7521	WARD	SALESMAN	7698	1981-02-22	1250.0	500.0	30	30	SALES	CHICAGO
7566	JONES	MANAGER	7839	1981-04-02	2975.0	NULL	20	20	RESEARCH	DALLAS
7654	MARTIN	SALESMAN	7698	1981-09-28	1250.0	1400.0	30	30	SALES	CHICAGO
7698	BLAKE	MANAGER	7839	1981-05-01	2850.0	NULL	30	30	SALES	CHICAGO
7782	CLARK	MANAGER	7839	1981-06-09	2450.0	NULL	10	10	ACCOUNTING	NEW YORK
7788	SCOTT	ANALYST	7566	1987-07-13	3000.0	NULL	20	20	RESEARCH	DALLAS
7839	KING	PRESIDENT	NULL	1981-11-17	5000.0	NULL	10	10	ACCOUNTING	NEW YORK
7844	TURNER	SALESMAN	7698	1981-09-08	1100.0	0.0	30	30	SALES	CHICAGO
7876	ADAMS	CLERK	7788	1987-07-13	1100.0	NULL	20	20	RESEARCH	DALLAS
7900	JAMES	CLERK	7698	1981-12-03	950.0	NULL	30	30	SALES	CHICAGO
7902	FORD	ANALYST	7566	1981-12-03	3000.0	NULL	20	20	RESEARCH	DALLAS
7934	MILLER	CLERK	7782	1982-01-23	1300.0	NULL	10	10	ACCOUNTING	NEW YORK
Time taken: 54.091 seconds, Fetched: 14 row(s)
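The comma-join with a WHERE filter above can equally be written as an explicit JOIN, which also avoids an accidental Cartesian product if the filter is forgotten:

```sql
select e.empno, e.ename, e.sal, d.dname, d.loc
from emp e
join dept d on e.deptno = d.deptno;
```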
hive> select max(sal) maxsal , min(sal) minsal from emp group by deptno;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the
future versions. Consider using a different execution engine (i.e. spark, tez) or
using Hive 1.X releases.
Query ID = root_20260325152658_622daa51-1a5d-423f-856d-a58f541fc784
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1774421907259_0002, Tracking URL =
http://c001:8099/proxy/application_1774421907259_0002/
Kill Command = /opt/bigdata/hadoop-2.7.1/bin/hadoop job -kill
job_1774421907259_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2026-03-25 15:27:29,539 Stage-1 map = 0%, reduce = 0%
2026-03-25 15:27:37,570 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.74 sec
2026-03-25 15:27:44,254 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3.64
sec
MapReduce Total cumulative CPU time: 3 seconds 640 msec
Ended Job = job_1774421907259_0002
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 3.64 sec HDFS Read: 10832
HDFS Write: 163 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 640 msec
OK
5000.0 1300.0
3000.0 800.0
2850.0 950.0
Time taken: 47.891 seconds, Fetched: 3 row(s)
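The result above carries no department labels because deptno is not in the select list; grouping columns may be selected alongside the aggregates:

```sql
select deptno, max(sal) as maxsal, min(sal) as minsal
from emp
group by deptno;
```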
5. Insert
hive> insert into scott.dept (deptno,dname,loc) values (50,"gong guan","tsrj");
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the
future versions. Consider using a different execution engine (i.e. spark, tez) or
using Hive 1.X releases.
Query ID = root_20260325153147_27aedfad-6e1e-4c82-b961-16baf00fde88
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1774421907259_0003, Tracking URL =
http://c001:8099/proxy/application_1774421907259_0003/
Kill Command = /opt/bigdata/hadoop-2.7.1/bin/hadoop job -kill
job_1774421907259_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2026-03-25 15:32:08,785 Stage-1 map = 0%, reduce = 0%
2026-03-25 15:32:15,979 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.26 sec
MapReduce Total cumulative CPU time: 2 seconds 260 msec
Ended Job = job_1774421907259_0003
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory
hdfs://192.168.100.101:9000/user/hive/warehouse/scott.db/dept/.hive-
staging_hive_2026-03-25_15-31-47_882_4932928533857558034-1/-ext-10000
Loading data to table scott.dept
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 2.26 sec HDFS Read: 4324 HDFS Write: 84
SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 260 msec
OK
Time taken: 30.66 seconds
hive> select * from dept;
OK
50 gong guan tsrj
10 ACCOUNTING NEW YORK
20 RESEARCH DALLAS
30 SALES CHICAGO
40 OPERATIONS BOSTON
Time taken: 0.143 seconds, Fetched: 5 row(s)
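Because each INSERT run launches its own jobs and produces another file under the table directory, batching several rows into one statement is cheaper than issuing them one at a time. A sketch (the rows here are made-up examples):

```sql
-- Hypothetical rows, for illustration only:
insert into scott.dept values
  (60, 'it', 'dallas'),
  (70, 'qa', 'boston');
```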
Hive can append new data, but (without transactional tables) it cannot delete or update existing rows, and some SQL constructs and complex subqueries are likewise unsupported.
Note:
Every execution of an INSERT statement effectively adds one more file under the table's directory.
The same effect can be achieved either by loading files through Hive (recommended) or by placing the data files directly into the table's HDFS directory; either approach works.
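The second approach, dropping a file straight into the table's HDFS directory, can be sketched from the Hive CLI itself; since the table is a plain textfile table, the next query picks the rows up automatically (the file name below is hypothetical, and the file must be tab-delimited to match the table DDL):

```sql
dfs -put /tmp/dept_extra.tsv /user/hive/warehouse/scott.db/dept/;
select * from dept;
```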