Hive的CTE 公共表达式

目录

1.语法

[2. 使用场景](#2. 使用场景)

select语句

[chaining CTEs 链式](#chaining CTEs 链式)

union语句

[insert into 语句](#insert into 语句)

[create table as 语句](#create table as 语句)

前言

Common Table Expressions(CTE):公共表达式是一个临时的结果集,该结果集是从with子句中指定的查询派生而来的,紧跟在select 或 insert关键字之前。CTE可以在 select,insert, create table as select 等语句中使用。

1.语法

sql 复制代码
[wtih CommonTableExpression]
select
        column1,
        column2, ...
from table 
[where 条件] 
[group by column]
[order by column] 
[cluster by column| [distribute by column] [sort by column] 
[limit [offset,] rows];

2. 使用场景

select语句

sql 复制代码
with tmp as (
    select
        oid,
        uid,
        otime,
        date_format(otime, 'yyyy-MM') as dt,
        oamount,
        ---计算rk的目的是为了获取记录中的第一条
        row_number() over (partition by uid,date_format(otime, 'yyyy-MM') order by otime) rk
    from t_order
)
 select
    uid,
    --每个用户一月份的订单数
    sum(if(dt = '2018-01', 1, 0)) as  m1_count,
    --每个用户二月份的订单数
    sum(if(dt = '2018-02', 1, 0)) as  m2_count
from tmp
 group by uid
 having m1_count >0 and m2_count=0;

chaining CTEs 链式

sql 复制代码
with tmp1 as (
    select
        oid,
        uid,
        otime,
        date_format(otime, 'yyyy-MM') as dt,
        oamount,
        ---计算rk的目的是为了获取记录中的第一条
        row_number() over (partition by uid,date_format(otime, 'yyyy-MM') order by otime) as rk
    from t_order
),
     tmp2 as
         (select
              uid,
              --每个用户一月份的订单数
              sum(if(dt = '2018-01', 1, 0)) as m1_count,
              --每个用户二月份的订单数
              sum(if(dt = '2018-02', 1, 0)) as m2_count
          from tmp1
          group by uid
          having m1_count > 0
             and m2_count = 0)
select * from tmp2 limit 1;

union语句

sql 复制代码
with q1 as (select * from student where num = 95002),
     q2 as (select * from student where num = 95004)
select * from q1 union all select * from q2;

insert into 语句

sql 复制代码
with tmp1 as (
    select
        oid,
        uid,
        otime,
        date_format(otime, 'yyyy-MM') as dt,
        oamount,
        ---计算rk的目的是为了获取记录中的第一条
        row_number() over (partition by uid,date_format(otime, 'yyyy-MM') order by otime) as rk
    from t_order
),
     tmp2 as
         (select
              uid,
              --每个用户一月份的订单数
              sum(if(dt = '2018-01', 1, 0)) as m1_count,
              --每个用户二月份的订单数
              sum(if(dt = '2018-02', 1, 0)) as m2_count
          from tmp1
          group by uid
          having m1_count > 0
             and m2_count = 0)

insert into tmp3
select * from tmp2 limit 10;

create table as 语句

sql 复制代码
--- 从tmp2 表中取10条数据,基于此创建表tmp3 
create table tmp3 as 
with tmp1 as (
    select
        oid,
        uid,
        otime,
        date_format(otime, 'yyyy-MM') as dt,
        oamount,
        ---计算rk的目的是为了获取记录中的第一条
        row_number() over (partition by uid,date_format(otime, 'yyyy-MM') order by otime) as rk
    from t_order
),
     tmp2 as
         (select
              uid,
              --每个用户一月份的订单数
              sum(if(dt = '2018-01', 1, 0)) as m1_count,
              --每个用户二月份的订单数
              sum(if(dt = '2018-02', 1, 0)) as m2_count
          from tmp1
          group by uid
          having m1_count > 0
             and m2_count = 0)
select * from tmp2 limit 10;
相关推荐
想ai抽9 小时前
深入starrocks-多列联合统计一致性探查与策略(YY一下)
java·数据库·数据仓库
starfalling102410 小时前
【hive】一种高效增量表的实现
hive
D明明就是我14 小时前
Hive 拉链表
数据仓库·hive·hadoop
嘉禾望岗50318 小时前
hive join优化和数据倾斜处理
数据仓库·hive·hadoop
yumgpkpm18 小时前
华为鲲鹏 Aarch64 环境下多 Oracle 数据库汇聚操作指南 CMP(类 Cloudera CDP 7.3)
大数据·hive·hadoop·elasticsearch·zookeeper·big data·cloudera
忧郁火龙果20 小时前
六、Hive的基本使用
数据仓库·hive·hadoop
忧郁火龙果20 小时前
五、安装配置hive
数据仓库·hive·hadoop
chad__chang1 天前
dolphinscheduler安装过程
hive·hadoop
莫叫石榴姐2 天前
字节数开一面
大数据·数据仓库·职场和发展