目录
hive官方函数解释
hive官网函数大全地址: hive官网函数大全地址
Return Type | Name | Description |
---|---|---|
array | array(value1, value2, ...) | Creates an array with the given elements. |
boolean | array_contains(Array, value) | Returns TRUE if the array contains value. |
array | sort_array(Array) | Sorts the input array in ascending order according to the natural ordering of the array elements and returns it (as of version 0.9.0). |
T | explode(ARRAY a) | Explodes an array to multiple rows. Returns a row-set with a single column (col), one row for each element from the array. |
int,T | posexplode(ARRAY a) | Explodes an array to multiple rows with additional positional column of int type (position of items in the original array, starting with 0). Returns a row-set with two columns (pos,val), one row for each element from the array. |
示例
1、array(value1, value2, ...)
创建一个 Array 对象,由一组值组成。
sql
SELECT array(85, 90, 78, 92, 86) AS scores;
---结果
scores
[85,90,78,92,86]
2、array_contains(Array, value)
判断一个值是否在 Array 对象中存在。
sql
SELECT array_contains(array(85, 90, 78, 92, 86), 60) AS is_passed;
---结果
is_passed
false
SELECT if(array_contains(array(85, 90, 78, 92, 86), 60),1,0) AS is_passed;
---结果
is_passed
0
3、sort_array(Array)
sort_array 函数可以用于对 Array 对象中的元素进行排序。具体来说,sort_array 函数会将 Array 对象中的元素按照升序顺序进行排序,并返回一个新的排序后的 Array 对象。
sql
SELECT sort_array(array(85, 90, 78, 92, 86)) AS sorted_scores;
---结果
sorted_scores
[78,85,86,90,92]
4、explode (array)
explode 函数可以用于将一个 Array 对象拆分成多行。具体来说,explode 函数会将 Array 对象中的每个元素拆分成一行,并与原始数据集中的其他字段一起返回。
sql
select explode(array('A','B','C'));
select explode(array('A','B','C')) as col;
select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf;
select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf as col;
---上述四个结果均为
col
A
B
C
5、posexplode (array)
sql
select posexplode(array('A','B','C'));
select posexplode(array('A','B','C')) as (pos,val);
select tf.* from (select 0) t lateral view posexplode(array('A','B','C')) tf;
select tf.* from (select 0) t lateral view posexplode(array('A','B','C')) tf as pos,val;
---上述四个结果均为
pos col
0 A
1 B
2 C
实战
下面全年级的学生成绩单的部分,包含学生姓名,课程,分数,年级排名,要求将每人的成绩进行汇总整理:按排名,课程和成绩的顺序
数据:
powershell
'小明' ,'语文' ,80 as score , 100
'小明' ,'数学' ,90 as score, 85
'小明' ,'英语' ,75 as score, 203
'小花' ,'语文' ,85 as score , 90
'小花' ,'数学' ,65 as score , 350
'小花' ,'英语' ,90 as score , 20
代码实现:
sql
with students as(
select '小明' as name,'语文' as course ,80 as score , 100 as rk
union all
select '小明' as name,'数学' as course ,90 as score, 85 as rk
union all
select '小明' as name,'英语' as course ,75 as score, 203 as rk
union all
select '小花' as name,'语文' as course ,85 as score , 90 as rk
union all
select '小花' as name,'数学' as course ,65 as score , 350 as rk
union all
select '小花' as name,'英语' as course ,90 as score , 20 as rk
)
SELECT name,
concat_ws(',',
sort_array(
collect_list(
concat_ws(':',
lpad(cast(rk as string), 4, '0'),
course,
cast(score as string)
)
)
)
) AS sorted_scores
FROM students
GROUP BY name;
---结果
name sorted_scores
小明 0085:数学:90,0100:语文:80,0203:英语:75
小花 0020:英语:90,0090:语文:85,0350:数学:65