Hive Row/Column Transformation

Row/column transformation
Columns to rows

Use lateral view + explode(array|map) or lateral view + inline(array_struct) to turn a column into rows.

  • Single column to multiple rows: flattening a single array or map (key-value) column

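Example 1: explode(array(...))

```sql
-- Template: a lateral view needs a table alias (V) before the column alias
select ..., A
from T
lateral view explode(ARRAY_FIELD) V as A;
```

```sql
-- explode used directly in the select list, one row per array element
select explode(array(88.2, 98.3, 67.1)) as price;
```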
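The template above can be tried without any existing table. The following is a minimal sketch, assuming a Hive version that accepts a select without a from clause inside a CTE; the CTE name tmp and the columns name and prices are hypothetical and only there for illustration.

```sql
-- Hypothetical one-row input: a name plus an array of prices
with tmp as (
  select 'henry' as name, array(88.2, 98.3, 67.1) as prices
)
-- Each array element becomes its own row; name is repeated on every row
select name, price
from tmp
lateral view explode(prices) V as price;
```

Unlike the bare select explode(...) form, the lateral view form lets the other columns of the source row (here name) appear next to each exploded value, so the query should return three rows, one per price.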

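Example 2: explode(map(...))

```sql
-- Template: exploding a map yields two columns, key and value
select ..., K, V
from T
lateral view explode(MAP_FIELD) KV as K, V;
```

```sql
-- explode used directly in the select list, one row per map entry
select explode(map('java', 56, 'mysql', 88, 'javascript', 66)) as (subject, score);
```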
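The map form can be exercised the same way. This is a minimal sketch under the same assumption (a CTE with a select that has no from clause); tmp, name, scores, and the table alias KV are hypothetical names.

```sql
-- Hypothetical one-row input: a name plus a subject -> score map
with tmp as (
  select 'henry' as name,
         map('java', 56, 'mysql', 88, 'javascript', 66) as scores
)
-- Each map entry becomes one row with a key column and a value column
select name, subject, score
from tmp
lateral view explode(scores) KV as subject, score;
```

The query should produce one row per map entry, each carrying the same name alongside one subject and its score.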

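Example 3: inline(array_struct)

```sql
-- Template: inline expands an array of structs into one column per struct field
select ..., F1,...,FN
from T
lateral view inline(STRUCT_ARRAY_FIELD) V as F1,...,FN;
```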
```sql
with tmp as (
  select array(
    named_struct('name','henry','age',22,'is_member','true'),
    named_struct('name','pola','age',20,'is_member','true'),
    named_struct('name','ariel','age',19,'is_member','true')
  ) as array_struct
)
select name, age, is_member
from tmp
lateral view inline(array_struct) V as name, age, is_member;
```

lateral view inline(array_struct) turns each element of the struct array into a row, and each row holds the values of that struct's fields.

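Before (one row with a single array-of-structs column):

array_struct
[{"name":"henry","age":22,"is_member":"true"}, {"name":"pola","age":20,"is_member":"true"}, {"name":"ariel","age":19,"is_member":"true"}]

After (one row per struct):

name    age  is_member
henry   22   true
pola    20   true
ariel   19   true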

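  • Multiple columns to multiple rows

```sql
-- Template: pack the columns into an array (or map), then explode it
select ..., A
from T
lateral view explode(array|map(F1,...,FN)) V as A;
```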

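Example:

```sql
-- Each score column is wrapped in a struct tagged with its subject name,
-- then the array of structs is exploded into one row per subject
SELECT name, class, Scores.subject, Scores.score
FROM Students
LATERAL VIEW EXPLODE(ARRAY(
  named_struct('subject','math','score',math_score),
  named_struct('subject','science','score',science_score)
)) V AS Scores;
```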

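Before: one row per student, with math_score and science_score as separate columns.

After: two rows per student, one per subject, with the subject name and its score available as Scores.subject and Scores.score.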
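The same unpivot can also be written with inline, which returns the struct fields as plain columns instead of a single struct column. This is a minimal sketch: the CTE below stands in for the Students table, and its rows and values are made up for illustration.

```sql
-- Hypothetical sample rows standing in for the Students table
with Students as (
  select 'henry' as name, 'c1' as class, 90 as math_score, 85 as science_score
  union all
  select 'pola' as name, 'c1' as class, 78 as math_score, 92 as science_score
)
-- inline expands each struct into the columns subject and score
select name, class, subject, score
from Students
lateral view inline(array(
  named_struct('subject', 'math', 'score', math_score),
  named_struct('subject', 'science', 'score', science_score)
)) V as subject, score;
```

With the two sample students this should yield four rows, one per student and subject. Whether explode or inline reads better is mostly a matter of taste; inline avoids the Scores.field indirection in the select list.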

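Rows to columns
  • Multiple rows to multiple columns
    Conditional aggregation: values from the rows that satisfy a given condition are aggregated into one column of a single output row.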
```sql
-- Template: if(condition, value_if_true, value_if_false),
-- so rows that do not match the condition contribute 0 to the sum
select
  F1,...,
  sum(if(C1,V1,0)) as A1,
  sum(if(C2,V2,0)) as A2,
  sum(if(C3,V3,0)) as A3
from TABLE_NAME
group by F1,...;
```

```sql
-- Pivot monthly order amounts into one row per year
drop table if exists lateral_view_stack_test1w;
create table lateral_view_stack_test1w as
select year,
       sum(if(month(order_time)=1,order_amount,0)) as sum_jan,
       sum(if(month(order_time)=2,order_amount,0)) as sum_feb,
       sum(if(month(order_time)=3,order_amount,0)) as sum_mar,
       sum(if(month(order_time)=4,order_amount,0)) as sum_apr,
       sum(if(month(order_time)=5,order_amount,0)) as sum_may,
       sum(if(month(order_time)=6,order_amount,0)) as sum_jun,
       sum(if(month(order_time)=7,order_amount,0)) as sum_jul,
       sum(if(month(order_time)=8,order_amount,0)) as sum_aug,
       sum(if(month(order_time)=9,order_amount,0)) as sum_sep,
       sum(if(month(order_time)=10,order_amount,0)) as sum_oct,
       sum(if(month(order_time)=11,order_amount,0)) as sum_nov,
       sum(if(month(order_time)=12,order_amount,0)) as sum_dec
from hive_internal_par_regex_test1w
where year>=2014
group by year;
```
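To see the conditional aggregation on a small input, the following sketch pivots a handful of made-up orders. The CTE name orders, the column order_year, and all values are hypothetical; only two months are pivoted to keep it short.

```sql
-- Hypothetical sample orders: two in January 2014, one in February 2014,
-- one in February 2015
with orders as (
  select 2014 as order_year, '2014-01-15' as order_time, 100 as order_amount
  union all
  select 2014 as order_year, '2014-01-20' as order_time, 50 as order_amount
  union all
  select 2014 as order_year, '2014-02-03' as order_time, 30 as order_amount
  union all
  select 2015 as order_year, '2015-02-10' as order_time, 70 as order_amount
)
-- Each sum only counts the rows whose month matches; other rows add 0
select order_year,
       sum(if(month(order_time) = 1, order_amount, 0)) as sum_jan,
       sum(if(month(order_time) = 2, order_amount, 0)) as sum_feb
from orders
group by order_year;
```

For 2014 the January column should total 150 and the February column 30; for 2015 January is 0 and February is 70. Note that a month with no orders shows up as 0 rather than NULL, because the else branch of if contributes 0 for every non-matching row.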