Hive数据仓库行转列

查了很多资料发现网上很多文章都是转发和抄袭,有些问题。这里分享一个自己项目中使用的行转列例子,供大家参考。代码如下:

sql 复制代码
SELECT
  my_id,
  nm_cd_map['A'] AS my_cd_a,
  nm_cd_map['B'] AS my_cd_b,
  nm_cd_map['C'] AS my_cd_c,
  nm_num_map['A'] AS my_num_a,
  nm_num_map['B'] AS my_num_b,
  nm_num_map['C'] AS my_num_c
FROM
  (
    SELECT
      t.my_id,
      STR_TO_MAP(my_nm_cds,';',':') AS nm_cd_map,
      STR_TO_MAP(my_nm_nums,';',':') AS nm_num_map
    FROM
      (
        SELECT
          my_id,
          CONCAT_WS(';',COLLECT_LIST(CONCAT(my_nm,':',my_cd))) AS my_nm_cds,
          CONCAT_WS(';',COLLECT_LIST(CONCAT(my_nm,':',my_num))) AS my_nm_nums
        FROM
          (
            SELECT '1' AS my_id,'A' AS my_nm,'D01' AS my_cd,19 AS my_num
            UNION ALL
            SELECT '1' AS my_id,'B' AS my_nm,'D04' AS my_cd,18 AS my_num
            UNION ALL
            SELECT '1' AS my_id,'C' AS my_nm,'D02' AS my_cd,17 AS my_num
            UNION ALL
            SELECT '2' AS my_id,'A' AS my_nm,'D03' AS my_cd,16 AS my_num
            UNION ALL
            SELECT '2' AS my_id,'B' AS my_nm,'D05' AS my_cd,15 AS my_num
            UNION ALL
            SELECT '2' AS my_id,'C' AS my_nm,'D06' AS my_cd,14 AS my_num
          )
        GROUP BY my_id
      ) t
  ) t
WHERE 1=1;

如果是在SparkSQL或Presto平台,或者阿里云的MaxCompute平台,还可使用如下方式:

sql 复制代码
-- 其实也可使用CONCAT然后STR_TO_MAP的方式,或者用MAP_FROM_ARRAYS,再或者用数组排序后ARRAY[n] AS的方式
SELECT
  my_id,
  nm_cd_map['A'] AS my_cd_a,
  nm_cd_map['B'] AS my_cd_b,
  nm_cd_map['C'] AS my_cd_c,
  nm_num_map['A'] AS my_num_a,
  nm_num_map['B'] AS my_num_b,
  nm_num_map['C'] AS my_num_c
FROM
  (
    SELECT
      t.my_id,
      MAP_FROM_ENTRIES(COLLECT_LIST(nm_cd)) AS nm_cd_map,
      MAP_FROM_ENTRIES(COLLECT_LIST(nm_num)) AS nm_num_map
    FROM
      (
        SELECT
          my_id,
          my_nm,
          my_cd,
          my_num,
          STRUCT(my_nm,my_cd) AS nm_cd,
          STRUCT(my_nm,my_num) AS nm_num
        FROM
          (
            SELECT '1' AS my_id,'A' AS my_nm,'D01' AS my_cd,19 AS my_num
            UNION ALL
            SELECT '1' AS my_id,'B' AS my_nm,'D04' AS my_cd,18 AS my_num
            UNION ALL
            SELECT '1' AS my_id,'C' AS my_nm,'D02' AS my_cd,17 AS my_num
            UNION ALL
            SELECT '2' AS my_id,'A' AS my_nm,'D03' AS my_cd,16 AS my_num
            UNION ALL
            SELECT '2' AS my_id,'B' AS my_nm,'D05' AS my_cd,15 AS my_num
            UNION ALL
            SELECT '2' AS my_id,'C' AS my_nm,'D06' AS my_cd,14 AS my_num
          )
      ) t
    GROUP BY my_id
  ) t
WHERE 1=1;
相关推荐
飞Link19 小时前
【Sqoop】Linux(CentOS7)下安装Sqoop教程
linux·hive·hadoop·sqoop
飞Link20 小时前
【Hive】Linux(CentOS7)下安装Hive教程
大数据·linux·数据仓库·hive·hadoop
心止水j1 天前
hbase 电商1
hive
小鸡脚来咯1 天前
Hive分桶表:大数据开发的性能优化利器
大数据·hive·性能优化
木卫二号Coding1 天前
hivesql 字段aa值 如何去掉前面的0
hive
yumgpkpm2 天前
Cloudera CDP 7.3(国产CMP 鲲鹏版)平台与银行五大平台的技术对接方案
大数据·人工智能·hive·zookeeper·flink·kafka·cloudera
默 语4 天前
Spring Boot 3.x升级踩坑记:到底值不值得升级?
hive·spring boot·后端
ha_lydms4 天前
2、Spark 函数_a/b/c
大数据·c语言·hive·spark·时序数据库·dataworks·数据开发
本旺6 天前
【数据开发离谱场景记录】Hive + ES 复杂查询场景处理
hive·hadoop·elasticsearch
悟能不能悟6 天前
springboot全局异常
大数据·hive·spring boot