1、任务
获取邮箱字符串'@'后字符串 ,求长度
2、hive & spark-sql 求数组长度的函数 size
sql
hive & spark-sql 求数组长度的函数 size
select size(split(email, '@')),split(email, '@'),split(email, '@')[0],split(email, '@')[1]
FROM
(select "jack@126.com" as email union select "tom@126.com.cn" as email) tb_mid;
select size(split(email, '@')),split(email, '@'),split(email, '@')[0],split(email, '@')[1]
FROM
(select 'jack@126.com' as email union select 'tom@126.com.cn' as email) tb_mid;
2 ["tom","126.com.cn"] tom 126.com.cn
2 ["jack","126.com"] jack 126.com
Time taken: 0.723 seconds, Fetched 2 row(s)
3、presto 求数组长度的函数 cardinality
sql
presto 求数组长度的函数 cardinality
select cardinality(split(email, '@')),split(email, '@'),split(email, '@')[1],split(email, '@')[2]
FROM
(select 'jack@126.com' as email union select 'tom@126.com.cn' as email) tb_mid;
_col0 | _col1 | _col2 | _col3
-------+-------------------+-------+------------
2 | [tom, 126.com.cn] | tom | 126.com.cn
2 | [jack, 126.com] | jack | 126.com
(2 rows)
select cardinality(split(email, '@')),split(email, '@'),split(email, '@')[1],split(email, '@')[2]
FROM
(select "jack@126.com" as email union select "tom@126.com.cn" as email) tb_mid;
Query 20231019_070945_20009_n9u2s failed: line 3:9: Column 'jack@126.com' cannot be resolved
select cardinality(split(email, '@')),split(email, '@'),split(email, '@')[1],split(email, '@')[2]
FROM
(select "jack@126.com" as email union select "tom@126.com.cn" as email) tb_mid
4、注意事项
1)、在计算数组长度的时候,hive和presto的函数不同
其中hive的size函数默认数组的下标从0开始
presto的cardinality函数默认数组的下标从1开始
2)、presto 不支持双引号 ,而hive 既支持单引号,也支持双引号
sql
presto> SELECT
-> email,
-> (case when cardinality(split(email, '@')) = 2 then split(email, '@')[1] else '' end ) as email_suffix
-> FROM
-> (select "jack@126.com" as email union select "tom@126.com.cn" as email) tb_mid;
Query 20231016_070153_17958_p9f2s failed: line 5:9: Column 'jack@126.com' cannot be resolved
SELECT
email,
(case when cardinality(split(email, '@')) = 2 then split(email, '@')[1] else '' end ) as email_suffix
FROM
(select "jack@126.com" as email union select "tom@126.com.cn" as email) tb_mid