Summary of map-related functions in Hive

Official Hive function descriptions

The complete built-in function reference is available on the official Apache Hive wiki.

Return Type | Name | Description
--- | --- | ---
map | map(key1, value1, key2, value2, ...) | Creates a map with the given key/value pairs.
array | map_values(Map<K.V>) | Returns an unordered array containing the values of the input map.
array | map_keys(Map<K.V>) | Returns an unordered array containing the keys of the input map.
map<string,string> | str_to_map(text[, delimiter1, delimiter2]) | Splits text into key-value pairs using two delimiters. Delimiter1 separates the text into key-value pairs, and delimiter2 splits each key from its value. The defaults are ',' for delimiter1 and ':' for delimiter2.
Tkey,Tvalue | explode(MAP<Tkey,Tvalue> m) | Explodes a map into multiple rows. Returns a row-set with two columns (key, value), one row for each key-value pair from the input map. (As of Hive 0.8.0.)

Examples

1. map(key1, value1, key2, value2, ...)

sql
SELECT map('name', '张三', 'age', 20, 'gender', '男') AS student;
-- Result:
student
{"age":"20","gender":"男","name":"张三"}

2. map_keys(Map<K.V>)

sql
SELECT map_keys(map('name', '张三', 'age', 20, 'gender', '男')) AS keys;
-- Result:
keys
["name","age","gender"]

3. map_values(Map<K.V>)

sql
SELECT map_values(map('name', '张三', 'age', 20, 'gender', '男')) AS `values`;
-- Result:
values
["张三","20","男"]

4. str_to_map(str, delimiter1, delimiter2)

str_to_map converts a string into a map. It parses a string made up of key-value pairs into a map object: str is the string to convert, delimiter1 is the separator between key-value pairs, and delimiter2 is the separator between each key and its value. By default, delimiter1 is ',' and delimiter2 is ':'.

sql
SELECT str_to_map('name:张三,age:20,gender:男', ',', ':') AS student;
-- Result:
student
{"age":"20","gender":"男","name":"张三"}

SELECT str_to_map('name=张三,age=20,gender=男', ',', '=') AS student;
-- Result:
student
{"age":"20","gender":"男","name":"张三"}

5. explode(map)

sql
select explode(map('A',10,'B',20,'C',30));
select explode(map('A',10,'B',20,'C',30)) as (key,value);
select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf;
select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf as key,value;
-- All four queries above return:
key     value
A       10
B       20
C       30
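
In practice the map is usually a column of a real table; below is a sketch of exploding such a column while keeping the other columns (the table user_tags and its columns are hypothetical):

sql
-- hypothetical table: user_tags(user_id string, tags map<string,string>)
SELECT t.user_id, kv.tag_key, kv.tag_value
FROM user_tags t
LATERAL VIEW explode(t.tags) kv AS tag_key, tag_value;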

Practical example

Given a set of student records with name, grade, course, and score fields, compute, for each course, the average score together with the students who took the course and each student's grade.

sql
with stud as
( select  'zhang3' as name ,'优' as grade  ,'math' as course ,'88' as score  
  union all 
  select  'li4' as name ,'良' as grade  ,'math' as course ,'72' as score
    union all 
  select  'zhao6' as name ,'差' as grade  ,'math' as course ,'44' as score
    union all 
  select  'wang5' as name ,'优' as grade  ,'chinese' as course ,'80' as score
    union all 
  select  'zhao6' as name ,'优' as grade  ,'chinese' as course ,'55' as score
    union all 
  select  'tian7' as name ,'优' as grade  ,'chinese' as course ,'75' as score
)

-- sql1
select course, collect_set(concat(name,':',grade)) as collect , avg(score) from stud group by course;
-- Result:
course          collect                              avg(score)
math            ["li4:良","zhao6:差","zhang3:优"]       68.0
chinese         ["wang5:优","tian7:优","zhao6:优"]      70.0
-- sql2
select course, concat_ws(',',collect_set(concat(name,':',grade))) as strings , avg(score) from stud group by course;
-- Result:
course          strings                              avg(score)
math            li4:良,zhao6:差,zhang3:优               68.0
chinese         wang5:优,tian7:优,zhao6:优              70.0
-- sql3
select course, str_to_map(concat_ws(',',collect_set(concat(name,':',grade))),',',':') as maps , avg(score) from stud group by course;
-- Result:
course          maps                                    avg(score)
math            {"li4":"良","zhang3":"优","zhao6":"差"}    68.0
chinese         {"tian7":"优","wang5":"优","zhao6":"优"}   70.0

Note:

In the first SQL, the collect column has type array; in the second, the strings column has type string; in the third, the maps column has type map.

The question: starting from the second form, can we produce the results of the first and the third while keeping the column type as string?

The following implements converting the second form into the third, which is essentially turning the map format into a JSON-style string.

sql
with stud as
( select  'zhang3' as name ,'优' as grade  ,'math' as course ,'88' as score  
  union all 
  select  'li4' as name ,'良' as grade  ,'math' as course ,'72' as score
    union all 
  select  'zhao6' as name ,'差' as grade  ,'math' as course ,'44' as score
    union all 
  select  'wang5' as name ,'优' as grade  ,'chinese' as course ,'80' as score
    union all 
  select  'zhao6' as name ,'优' as grade  ,'chinese' as course ,'55' as score
    union all 
  select  'tian7' as name ,'优' as grade  ,'chinese' as course ,'75' as score
)

select
    course,
    -- wrap with the outer braces and quotes to complete the JSON-style string
    concat('{"', string2, '"}') as string3
from (
    select
        course,
        -- turn  k1":"v1,k2":"v2  into  k1":"v1","k2":"v2
        regexp_replace(string1, '\\,', '\\"\\,\\"') as string2
    from (
        select
            course,
            -- rebuild each pair as  key":"value  and join the pairs with ','
            concat_ws(',', collect_list(concat_ws('":"', k, v))) as string1
        from (
            -- same as sql3: the per-course map plus the average score
            select course,
                   str_to_map(concat_ws(',', collect_set(concat(name, ':', grade))), ',', ':') as maps,
                   avg(score)
            from stud
            group by course
        ) test_map_1
        lateral view outer explode(maps) kv as k, v
        group by course
    ) tt
) tm

-- Result:
course          string3
math            {"li4":"良","zhang3":"优","zhao6":"差"}
chinese         {"tian7":"优","wang5":"优","zhao6":"优"}
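
The regexp_replace round-trip assumes that no name or grade contains ',' or ':'. Under the same assumption, a simpler sketch builds the JSON-style string directly from name and grade, skipping the intermediate map (it reuses the stud CTE from above; key order may differ because collect_list keeps encounter order):

sql
select
    course,
    concat('{', concat_ws(',', collect_list(concat('"', name, '":"', grade, '"'))), '}') as string3,
    avg(score)
from stud
group by course;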