Advanced Presto usage (grouping, grouping sets)

Contents

Preparation

Creating the table in Hive

Computing in Presto

Separate queries

Count people by city

Count people by gender

Count people by hobby

Count people by city and gender

Count people by city and hobby

Count people by gender and hobby

Count people by city, gender, and hobby

Count all people

Merged query

Using grouping in Presto

Using grouping sets in Presto

What grouping is for: an example

Advanced usage: cube

Using rollup


Preparation:

Creating the table in Hive
```sql
-- Start from a clean database (this drops db_test and everything in it).
drop database if exists db_test cascade;

create database db_test;

-- Fields in the source file are separated by tabs.
create table db_test.tb_student(
    name  string,
    score int,
    city  string,
    sex   string,
    hobby string
)
row format delimited fields terminated by '\t';

load data local inpath '/test/student.txt' into table db_test.tb_student;

select * from db_test.tb_student;
```

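Contents of student.txt (the fields must be separated by tabs to match the table's '\t' delimiter; they are shown with spaces here):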

张三 10 北京 男 喝酒

李四 20 北京 男 抽烟

王五 30 北京 女 烫头

赵六 40 上海 男 抽烟

麻七 50 上海 女 烫头
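
If no Hive cluster is at hand, the same five rows can be supplied inline so that the queries in the following sections can still be tried. This is only a minimal sketch: the CTE name tb_student_inline is made up for illustration and is not part of the original setup.

```sql
-- Hypothetical inline copy of student.txt (tb_student_inline is an
-- illustrative name, not a table from the original setup).
with tb_student_inline as (
    select *
    from (
        values
            ('张三', 10, '北京', '男', '喝酒'),
            ('李四', 20, '北京', '男', '抽烟'),
            ('王五', 30, '北京', '女', '烫头'),
            ('赵六', 40, '上海', '男', '抽烟'),
            ('麻七', 50, '上海', '女', '烫头')
    ) as t (name, score, city, sex, hobby)
)
-- Same shape as the per-city count used later in the article.
select city, count(1) as cnt
from tb_student_inline
group by city;
```

With this data the query should report 3 rows for 北京 and 2 for 上海. Any later query can be tried the same way by replacing hive.db_test.tb_student with the inline CTE.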

Computing in Presto

Separate queries
Count people by city
```sql
select city, count(1) as cnt from hive.db_test.tb_student group by city;
```
Count people by gender
```sql
select sex, count(1) as cnt from hive.db_test.tb_student group by sex;
```
Count people by hobby
```sql
select hobby, count(1) as cnt from hive.db_test.tb_student group by hobby;
```
Count people by city and gender
```sql
select city, sex, count(1) as cnt from hive.db_test.tb_student group by city, sex;
```
Count people by city and hobby
```sql
select city, hobby, count(1) as cnt from hive.db_test.tb_student group by city, hobby;
```
Count people by gender and hobby
```sql
select sex, hobby, count(1) as cnt from hive.db_test.tb_student group by sex, hobby;
```
Count people by city, gender, and hobby
```sql
select city, sex, hobby, count(1) as cnt from hive.db_test.tb_student group by city, sex, hobby;
```
Count all people
```sql
-- group by () is the empty grouping set: a single total over all rows.
select count(1) as cnt from hive.db_test.tb_student group by ();
```
Merged query
```sql
-- One branch per grouping; columns that do not take part are filled with null.
-- The column o only records which branch a row came from, for ordering.
with t1 as (
    select city, null as sex, null as hobby, count(1) as cnt, 1 as o from hive.db_test.tb_student group by city
    union all
    select null as city, sex, null as hobby, count(1) as cnt, 2 as o from hive.db_test.tb_student group by sex
    union all
    select null, null, hobby, count(1) as cnt, 3 as o from hive.db_test.tb_student group by hobby
    union all
    select city, sex, null, count(1) as cnt, 4 as o from hive.db_test.tb_student group by city, sex
    union all
    select city, null, hobby, count(1) as cnt, 5 as o from hive.db_test.tb_student group by city, hobby
    union all
    select null, sex, hobby, count(1) as cnt, 6 as o from hive.db_test.tb_student group by sex, hobby
    union all
    select city, sex, hobby, count(1) as cnt, 7 as o from hive.db_test.tb_student group by city, sex, hobby
    union all
    select null, null, null, count(1) as cnt, 8 as o from hive.db_test.tb_student group by ()
)
select * from t1
order by o, city, sex, hobby
;
```

Using grouping in Presto

```sql
-- With a plain group by, both columns take part in every row, so g is always 0.
select
    city,
    sex,
    count(1) as cnt,
    grouping(city, sex) as g
from hive.db_test.tb_student
group by city, sex
;
```

Using grouping sets in Presto

The simplest form lists one grouping set per column:
```sql
select
    city,
    sex,
    hobby,
    count(1) as cnt,
    grouping(city, sex, hobby) as g
from hive.db_test.tb_student
group by grouping sets (city, sex, hobby)
;
```

Listing all eight combinations reproduces the merged query in a single statement:
```sql
select
    city,
    sex,
    hobby,
    count(1) as cnt,
    grouping(city, sex, hobby) as g
from hive.db_test.tb_student
group by grouping sets (city, sex, hobby, (city, sex), (city, hobby), (sex, hobby), (city, sex, hobby), ())
;
```

The grouping value can then be mapped back to the ordering column o of the merged query:
```sql
select
    city,
    sex,
    hobby,
    count(1) as cnt,
    case
        when grouping(city, sex, hobby)=3 then 1   -- 011: city only
        when grouping(city, sex, hobby)=5 then 2   -- 101: sex only
        when grouping(city, sex, hobby)=6 then 3   -- 110: hobby only
        when grouping(city, sex, hobby)=1 then 4   -- 001: city, sex
        when grouping(city, sex, hobby)=2 then 5   -- 010: city, hobby
        when grouping(city, sex, hobby)=4 then 6   -- 100: sex, hobby
        when grouping(city, sex, hobby)=0 then 7   -- 000: city, sex, hobby
        when grouping(city, sex, hobby)=7 then 8   -- 111: grand total
        else 100
    end as o
from hive.db_test.tb_student
group by grouping sets (city, sex, hobby, (city, sex), (city, hobby), (sex, hobby), (city, sex, hobby), ())
order by o, city, sex, hobby
;
```

What grouping is for: an example

```sql
-- Four rows for 北京; one of them has a genuine null in sex.
with t1 as (
    select '北京' as city, '男' as sex
    union all
    select '北京' as city, '男' as sex
    union all
    select '北京' as city, '女' as sex
    union all
    select '北京' as city, null as sex
)
select
    city,
    sex,
    count(1) as cnt
from t1
group by grouping sets (city, (city, sex))
```
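Problem:
    city=北京, sex=null, cnt=4
    city=北京, sex=null, cnt=1
    Why do two rows with the same city and sex values have different counts?
Cause:
    One null means the row has nothing to do with the sex column (sex is not part of that grouping set).
    The other null means the value of sex really is null, and the count is for that value.
    The output alone does not tell the two apart.
Solution:
    grouping(city, sex), read as one bit per column:
        0,0     grouped by both columns
        0,1     grouped by city only
        1,0     grouped by sex only
        1,1     grouped by neither column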
```sql
with t1 as (
    select '北京' as city, '男' as sex
    union all
    select '北京' as city, '男' as sex
    union all
    select '北京' as city, '女' as sex
    union all
    select '北京' as city, null as sex
)
select
    city,
    sex,
    count(1) as cnt,
    -- g = 1 (bits 0,1): subtotal by city; g = 0 (bits 0,0): grouped by city and sex
    grouping(city, sex) as g
from t1
group by grouping sets (city, (city, sex))
```
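
Once the two kinds of null can be told apart, they can also be labelled differently in the output. The following is a minimal sketch under that assumption; the labels '(all)' and '(null value)' and the column name sex_label are made up for illustration.

```sql
with t1 as (
    select '北京' as city, '男' as sex
    union all
    select '北京' as city, '男' as sex
    union all
    select '北京' as city, '女' as sex
    union all
    select '北京' as city, null as sex
)
select
    city,
    case
        -- sex is not part of this grouping set: the row is a subtotal
        when grouping(sex) = 1 then '(all)'
        -- sex is part of the grouping set, so this null is a real value
        when sex is null then '(null value)'
        else sex
    end as sex_label,
    count(1) as cnt
from t1
group by grouping sets (city, (city, sex))
```

With the four rows above, the '(all)' row should carry cnt=4 and the '(null value)' row cnt=1, which are exactly the two rows that looked identical before.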

Advanced usage: cube

```sql
-- cube(city, sex, hobby) generates all eight grouping sets, so every branch of the case can occur.
select
    city,
    sex,
    hobby,
    count(1) as cnt,
    case
        when grouping(city, sex, hobby)=3 then 1
        when grouping(city, sex, hobby)=5 then 2
        when grouping(city, sex, hobby)=6 then 3
        when grouping(city, sex, hobby)=1 then 4
        when grouping(city, sex, hobby)=2 then 5
        when grouping(city, sex, hobby)=4 then 6
        when grouping(city, sex, hobby)=0 then 7
        when grouping(city, sex, hobby)=7 then 8
        else 100
    end as o
from hive.db_test.tb_student
group by cube(city, sex, hobby)
order by o, city, sex, hobby
```

Using rollup

```sql
-- rollup(city, sex, hobby) only yields grouping values 0, 1, 3 and 7;
-- the other branches are kept so the case matches the cube query.
select
    city,
    sex,
    hobby,
    count(1) as cnt,
    case
        when grouping(city, sex, hobby)=3 then 1
        when grouping(city, sex, hobby)=5 then 2
        when grouping(city, sex, hobby)=6 then 3
        when grouping(city, sex, hobby)=1 then 4
        when grouping(city, sex, hobby)=2 then 5
        when grouping(city, sex, hobby)=4 then 6
        when grouping(city, sex, hobby)=0 then 7
        when grouping(city, sex, hobby)=7 then 8
        else 100
    end as o
from hive.db_test.tb_student
group by rollup(city, sex, hobby)
order by o, city, sex, hobby
;
```

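Summary:

Presto time functions:

The date type holds year, month, and day.

The timestamp type holds year, month, day, hour, minute, and second.

e.g. the literal timestamp '2024-08-18 22:13:10', or date_parse('2024-08-18 22:13:10', '%Y-%m-%d %H:%i:%s') to parse a string with an explicit format

date_add(unit, value, timestamp) adds value units of type unit to the timestamp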
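
The time functions listed here can be tried in a single statement. This is a small sketch with made-up column aliases; it uses only literals, so it needs no table.

```sql
select
    date '2024-08-18'                                        as d,           -- date literal
    timestamp '2024-08-18 22:13:10'                          as ts,          -- timestamp literal
    -- parse a string using a MySQL-style format string
    date_parse('2024-08-18 22:13:10', '%Y-%m-%d %H:%i:%s')    as parsed_ts,
    -- one day later: 2024-08-19 22:13:10
    date_add('day', 1, timestamp '2024-08-18 22:13:10')      as next_day,
    -- drop the time part: 2024-08-18
    cast(timestamp '2024-08-18 22:13:10' as date)            as ts_as_date;
```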

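grouping sets() takes a collection of grouping sets: each parenthesized entry is grouped on its own, and the results of all entries are returned together.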

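grouping works like an 8-4-2-1 binary code with one bit per argument: 0 means the column takes part in the grouping, 1 means it does not. The resulting number shows which columns a row was grouped by.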
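
As a worked example, grouping(city, sex, hobby) = 5 is 101 in binary: city=1 (not grouped), sex=0 (grouped), hobby=1 (not grouped), so the row is a count by sex only. The sketch below decodes each bit with bitwise_and instead of a hand-written case; the column names by_city, by_sex, by_hobby and the subquery alias x are made up for illustration.

```sql
select
    city, sex, hobby, cnt, g,
    -- the leftmost argument is the highest bit: city=4, sex=2, hobby=1
    bitwise_and(g, 4) = 0 as by_city,
    bitwise_and(g, 2) = 0 as by_sex,
    bitwise_and(g, 1) = 0 as by_hobby
from (
    select
        city, sex, hobby,
        count(1) as cnt,
        grouping(city, sex, hobby) as g
    from hive.db_test.tb_student
    group by cube(city, sex, hobby)
) as x
order by g, city, sex, hobby;
```

A true flag means the column took part in the grouping of that row, so a null in that column is a real null and not a placeholder.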

cube(city, sex, hobby) is equivalent to grouping sets (city, sex, hobby, (city, sex), (city, hobby), (sex, hobby), (city, sex, hobby), ())

rollup(city, sex, name) is equivalent to grouping sets ((city, sex, name), (city, sex), city, ())
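
One way to check such an equivalence on real data is to compare the two result sets with except. The sketch below does this for rollup, using hobby as the third column to match the earlier rollup query; the CTE names r and s and the column side are made up for illustration.

```sql
with r as (
    select city, sex, hobby, count(1) as cnt, grouping(city, sex, hobby) as g
    from hive.db_test.tb_student
    group by rollup(city, sex, hobby)
),
s as (
    select city, sex, hobby, count(1) as cnt, grouping(city, sex, hobby) as g
    from hive.db_test.tb_student
    group by grouping sets ((city, sex, hobby), (city, sex), (city), ())
)
-- rows produced by only one of the two forms; none are expected
select 'only in rollup' as side, d1.*
from (select * from r except select * from s) as d1
union all
select 'only in grouping sets' as side, d2.*
from (select * from s except select * from r) as d2;
```

If the two forms are equivalent, the query should return no rows. Because except compares distinct rows, this is a sanity check and not a proof.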
