Hive CTEs: Common Table Expressions

Contents

1. Syntax

2. Use cases

SELECT statement

Chaining CTEs

UNION statement

INSERT INTO statement

CREATE TABLE AS statement

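Preface

A Common Table Expression (CTE) is a temporary result set derived from a query specified in a WITH clause, which immediately precedes the SELECT or INSERT keyword. The CTE exists only within the single statement that defines it. CTEs can be used in SELECT, INSERT, CREATE TABLE AS SELECT, and similar statements.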
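
As a minimal sketch of this definition, the query below declares one CTE and reads from it in the same statement. The name `sample_orders` and its two inline rows are invented for illustration, and the sketch assumes a Hive version that accepts `select` without a `from` clause.

```sql
-- Hypothetical inline data: sample_orders is not a real table.
with sample_orders as (
    select 'u1' as uid, 100 as oamount
    union all
    select 'u2' as uid, 250 as oamount
)
-- The CTE is visible only inside this one statement.
select uid, oamount
from sample_orders
where oamount > 150;
```

Only the `u2` row satisfies the filter. Running `select * from sample_orders;` as a separate statement afterwards would fail, because the CTE no longer exists.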

1. Syntax

[with CommonTableExpression (, CommonTableExpression)*]
select
        column1,
        column2, ...
from table
[where condition]
[group by column]
[order by column]
[cluster by column | [distribute by column] [sort by column]]
[limit [offset,] rows];
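
To make the template concrete, here is a sketch that combines a CTE with several of the optional clauses. It assumes the `t_order` table used in the examples of this article, that `oamount` is numeric, and that the threshold `100` and the CTE name `big_orders` are arbitrary choices for illustration.

```sql
-- big_orders is a hypothetical CTE name; the threshold 100 is arbitrary.
with big_orders as (
    select uid, oamount
    from t_order
    where oamount > 100
)
select
    uid,
    sum(oamount) as total_amount
from big_orders
group by uid
order by total_amount desc
limit 10;
```

Two restrictions listed in the Hive language manual are worth keeping in mind when applying the template: recursive CTEs are not supported, and a WITH clause cannot appear inside a subquery block.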

2. Use cases

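SELECT statement

-- Users who placed at least one order in January 2018 and none in February 2018.
with tmp as (
    select
        oid,
        uid,
        otime,
        date_format(otime, 'yyyy-MM') as dt,
        oamount,
        -- rk numbers each user's orders within a month by order time, so rk = 1 marks the first one
        row_number() over (partition by uid, date_format(otime, 'yyyy-MM') order by otime) as rk
    from t_order
)
select
    uid,
    -- number of orders the user placed in January 2018
    sum(if(dt = '2018-01', 1, 0)) as m1_count,
    -- number of orders the user placed in February 2018
    sum(if(dt = '2018-02', 1, 0)) as m2_count
from tmp
group by uid
having m1_count > 0 and m2_count = 0;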
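
The query above computes `rk` but never filters on it. As a sketch of how such a column would typically be used, the following keeps only each user's first order in every month. A window function cannot be filtered in the `where` clause of the query level that computes it, which is one reason to wrap it in a CTE. The CTE name `ranked` is an assumption made for this example.

```sql
-- ranked is a hypothetical CTE name; the columns come from t_order as used above.
with ranked as (
    select
        oid,
        uid,
        otime,
        oamount,
        date_format(otime, 'yyyy-MM') as dt,
        row_number() over (
            partition by uid, date_format(otime, 'yyyy-MM')
            order by otime
        ) as rk
    from t_order
)
-- rk = 1 is the earliest order of each (uid, month) group.
select uid, dt, oid, otime, oamount
from ranked
where rk = 1;
```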

Chaining CTEs

with tmp1 as (
    select
        oid,
        uid,
        otime,
        date_format(otime, 'yyyy-MM') as dt,
        oamount,
        -- rk numbers each user's orders within a month by order time, so rk = 1 marks the first one
        row_number() over (partition by uid, date_format(otime, 'yyyy-MM') order by otime) as rk
    from t_order
),
-- tmp2 reads from tmp1, which is defined before it in the same WITH clause
tmp2 as (
    select
        uid,
        -- number of orders the user placed in January 2018
        sum(if(dt = '2018-01', 1, 0)) as m1_count,
        -- number of orders the user placed in February 2018
        sum(if(dt = '2018-02', 1, 0)) as m2_count
    from tmp1
    group by uid
    having m1_count > 0
       and m2_count = 0
)
select * from tmp2 limit 1;

UNION statement

with q1 as (select * from student where num = 95002),
     q2 as (select * from student where num = 95004)
select * from q1
union all
select * from q2;

INSERT INTO statement

-- The target table tmp3 must already exist, with columns compatible with tmp2.
with tmp1 as (
    select
        oid,
        uid,
        otime,
        date_format(otime, 'yyyy-MM') as dt,
        oamount,
        -- rk numbers each user's orders within a month by order time, so rk = 1 marks the first one
        row_number() over (partition by uid, date_format(otime, 'yyyy-MM') order by otime) as rk
    from t_order
),
tmp2 as (
    select
        uid,
        -- number of orders the user placed in January 2018
        sum(if(dt = '2018-01', 1, 0)) as m1_count,
        -- number of orders the user placed in February 2018
        sum(if(dt = '2018-02', 1, 0)) as m2_count
    from tmp1
    group by uid
    having m1_count > 0
       and m2_count = 0
)
insert into tmp3
select * from tmp2 limit 10;
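
The statement above only runs if `tmp3` already exists. The sketch below shows one way the target table might be declared, followed by an `insert overwrite` variant that replaces the table contents instead of appending. The column type of `uid` is a guess (the article does not give the schema of `t_order`), and `jan_only` is a hypothetical CTE name.

```sql
-- Assumed schema: the type of uid is a guess; the counts are bigint
-- because sum() over integer values returns bigint in Hive.
create table if not exists tmp3 (
    uid      string,
    m1_count bigint,
    m2_count bigint
);

-- Same pattern with overwrite: existing rows in tmp3 are replaced.
with jan_only as (
    select
        uid,
        sum(if(date_format(otime, 'yyyy-MM') = '2018-01', 1, 0)) as m1_count,
        sum(if(date_format(otime, 'yyyy-MM') = '2018-02', 1, 0)) as m2_count
    from t_order
    group by uid
    having m1_count > 0 and m2_count = 0
)
insert overwrite table tmp3
select uid, m1_count, m2_count
from jan_only;
```

Listing the columns explicitly in the final `select` keeps the insert aligned with the table definition even if the CTE later gains extra columns.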

CREATE TABLE AS statement

-- Create table tmp3 from 10 rows taken from the CTE tmp2.
create table tmp3 as
with tmp1 as (
    select
        oid,
        uid,
        otime,
        date_format(otime, 'yyyy-MM') as dt,
        oamount,
        -- rk numbers each user's orders within a month by order time, so rk = 1 marks the first one
        row_number() over (partition by uid, date_format(otime, 'yyyy-MM') order by otime) as rk
    from t_order
),
tmp2 as (
    select
        uid,
        -- number of orders the user placed in January 2018
        sum(if(dt = '2018-01', 1, 0)) as m1_count,
        -- number of orders the user placed in February 2018
        sum(if(dt = '2018-02', 1, 0)) as m2_count
    from tmp1
    group by uid
    having m1_count > 0
       and m2_count = 0
)
select * from tmp2 limit 10;
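
Hive also accepts a CTE in a `create view ... as` statement, following the same shape as the CTAS example. The sketch below is an illustration under assumptions: the view name `v_jan_orders` and the CTE name `jan_orders` are hypothetical, and `t_order` is the table used throughout this article.

```sql
-- v_jan_orders and jan_orders are hypothetical names.
create view v_jan_orders as
with jan_orders as (
    select oid, uid, otime, oamount
    from t_order
    where date_format(otime, 'yyyy-MM') = '2018-01'
)
select * from jan_orders;
```

Unlike the table created by CTAS, the view stores only the query definition, so it reflects later changes to `t_order` each time it is queried.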