力扣1225-报告系统状态的连续日期

表:Failed

复制代码
+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| fail_date    | date    |
+--------------+---------+
该表主键为 fail_date (具有唯一值的列)。
该表包含失败任务的天数.

表: Succeeded

复制代码
+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| success_date | date    |
+--------------+---------+
该表主键为 success_date (具有唯一值的列)。
该表包含成功任务的天数.

系统 每天 运行一个任务。每个任务都独立于先前的任务。任务的状态可以是失败或是成功。

编写解决方案找出 2019-01-012019-12-31 期间任务连续同状态 period_state 的起止日期(start_dateend_date)。即如果任务失败了,就是失败状态的起止日期,如果任务成功了,就是成功状态的起止日期。

最后结果按照起始日期 start_date 排序

返回结果样例如下所示:

示例 1:

复制代码
输入:
Failed table:
+-------------------+
| fail_date         |
+-------------------+
| 2018-12-28        |
| 2018-12-29        |
| 2019-01-04        |
| 2019-01-05        |
+-------------------+
Succeeded table:
+-------------------+
| success_date      |
+-------------------+
| 2018-12-30        |
| 2018-12-31        |
| 2019-01-01        |
| 2019-01-02        |
| 2019-01-03        |
| 2019-01-06        |
+-------------------+
输出:
+--------------+--------------+--------------+
| period_state | start_date   | end_date     |
+--------------+--------------+--------------+
| succeeded    | 2019-01-01   | 2019-01-03   |
| failed       | 2019-01-04   | 2019-01-05   |
| succeeded    | 2019-01-06   | 2019-01-06   |
+--------------+--------------+--------------+
解释:
结果忽略了 2018 年的记录,因为我们只关心从 2019-01-01 到 2019-12-31 的记录
从 2019-01-01 到 2019-01-03 所有任务成功,系统状态为 "succeeded"。
从 2019-01-04 到 2019-01-05 所有任务失败,系统状态为 "failed"。
从 2019-01-06 到 2019-01-06 所有任务成功,系统状态为 "succeeded"。

思路:

第一步,构造统一的时间状态表

将原本分散在不同表中的成功日期和失败日期,通过 UNION ALL 合并为一张统一的时间表。

在这一步中,明确每一条记录只包含两个核心信息:

  • 日期(date)

  • 状态(state:failed 或 succeeded)

这样做的目的是将问题从"多表问题"转化为"单一时间序列分组问题"。

第二步,通过窗口函数制造"分组标识"

连续区间的识别是本题的难点。

解决方法是利用两个行号序列的差值来刻画"连续性":

  • 一个行号按全局日期顺序递增

  • 一个行号按状态分组、日期顺序递增

当状态在日期序列中连续时,这两个行号的差值保持不变;

一旦状态发生切换,差值就会改变。

因此,全局行号 − 状态内行号 可以作为"连续区间的唯一标识"。

第三步,按区间标识进行分组聚合

在得到区间标识后,只需按以下维度分组:

  • 状态

  • 连续区间标识

然后分别取:

  • 最小日期作为区间起始

  • 最大日期作为区间结束

即可得到每一个状态连续区间的完整时间范围。

|--------------|----------|------------|-----------|----|------------|-----------|---------------------------------------------------------------------------------------------------|------------|-----------|-----|-----|---------|---------------------------------------------------------|--------------|------------|------------|---|
| 输入1 | | | | | | | | | | | | | | | | | |
| SUCCESS_DATE | | date | state | | date | state | | date | state | rn1 | rn2 | rn1-rn2 | | | | | |
| 2018-12-30 | | 2018-12-30 | succeeded | | 2018-12-28 | failed | row_number() over (order by dt) as rn1, row_number() over (partition by state order by dt) as rn2 | 2018-12-28 | failed | 1 | 1 | 0 | | date | state | | |
| 2018-12-31 | | 2018-12-31 | succeeded | | 2018-12-29 | failed | row_number() over (order by dt) as rn1, row_number() over (partition by state order by dt) as rn2 | 2018-12-29 | failed | 2 | 2 | 0 | where dt between date'2019-01-01' and date' 2019-12-31' | 2018-12-30 | succeeded | | |
| 2019-01-01 | | 2019-01-01 | succeeded | 排序 | 2018-12-30 | succeeded | row_number() over (order by dt) as rn1, row_number() over (partition by state order by dt) as rn2 | 2018-12-30 | succeeded | 3 | 1 | 2 | where dt between date'2019-01-01' and date' 2019-12-31' | 2018-12-31 | succeeded | group by state, (rn1 - rn2) order by start_date; ||
| 2019-01-02 | | 2019-01-02 | succeeded | 排序 | 2018-12-31 | succeeded | row_number() over (order by dt) as rn1, row_number() over (partition by state order by dt) as rn2 | 2018-12-31 | succeeded | 4 | 2 | 2 | where dt between date'2019-01-01' and date' 2019-12-31' | 2019-01-01 | succeeded | group by state, (rn1 - rn2) order by start_date; ||
| 2019-01-03 | 首先把两个表合并 | 2019-01-03 | succeeded | 排序 | 2019-01-01 | succeeded | row_number() over (order by dt) as rn1, row_number() over (partition by state order by dt) as rn2 | 2019-01-01 | succeeded | 5 | 3 | 2 | where dt between date'2019-01-01' and date' 2019-12-31' | 2019-01-02 | succeeded | group by state, (rn1 - rn2) order by start_date; ||
| 2019-01-06 | 首先把两个表合并 | 2019-01-06 | succeeded | 排序 | 2019-01-02 | succeeded | row_number() over (order by dt) as rn1, row_number() over (partition by state order by dt) as rn2 | 2019-01-02 | succeeded | 6 | 4 | 2 | where dt between date'2019-01-01' and date' 2019-12-31' | 2019-01-03 | succeeded | group by state, (rn1 - rn2) order by start_date; ||
| | 首先把两个表合并 | 2018-12-28 | failed | | 2019-01-03 | succeeded | row_number() over (order by dt) as rn1, row_number() over (partition by state order by dt) as rn2 | 2019-01-03 | succeeded | 7 | 5 | 2 | where dt between date'2019-01-01' and date' 2019-12-31' | 2019-01-04 | failed | group by state, (rn1 - rn2) order by start_date; ||
| 输入2 | 首先把两个表合并 | 2018-12-29 | failed | | 2019-01-04 | failed | row_number() over (order by dt) as rn1, row_number() over (partition by state order by dt) as rn2 | 2019-01-04 | failed | 8 | 3 | 5 | where dt between date'2019-01-01' and date' 2019-12-31' | 2019-01-05 | failed | group by state, (rn1 - rn2) order by start_date; ||
| FAIL_DATE | 首先把两个表合并 | 2019-01-04 | failed | | 2019-01-05 | failed | row_number() over (order by dt) as rn1, row_number() over (partition by state order by dt) as rn2 | 2019-01-05 | failed | 9 | 4 | 5 | | 2019-01-06 | succeeded | | |
| 2018-12-28 | | 2019-01-05 | failed | | 2019-01-06 | succeeded | | 2019-01-06 | succeeded | 10 | 6 | 4 | | | | | |
| 2018-12-29 | | | | | | | | | | (rn1 - rn2) ||| | | | | |
| 2019-01-04 | | | | | | | | | | (rn1 - rn2) ||| | period_state | start_date | end_date | |
| 2019-01-05 | | | | | | | | | | (rn1 - rn2) ||| | succeeded | 2019-01-01 | 2019-01-03 | |
| | | | | | | | | | | | | | | failed | 2019-01-04 | 2019-01-05 | |
| | | | | | | | | | | | | | | succeeded | 2019-01-06 | 2019-01-06 | |

代码:

sql 复制代码
 with t1 as (
    select to_char(fail_date,'YYYY-MM-DD') as dt, 'failed' as state
    from failed
    union all
    select to_char(success_date,'YYYY-MM-DD') as dt, 'succeeded' as state
    from succeeded
),
t as (
    select dt,
           state,
           row_number() over (order by dt) as rn1,
           row_number() over (partition by state order by dt) as rn2
    from t1
)
select state as period_state,
       min(dt) as start_date,
       max(dt) as end_date
from t
where dt between date'2019-01-01' and date' 2019-12-31'
group by state, (rn1 - rn2)
order by start_date;
相关推荐
进击的荆棘14 小时前
优选算法——滑动窗口
c++·算法·leetcode
csdn_aspnet14 小时前
奈飞工厂算法:个性化推荐系统的极限复刻
算法·netflix·奈飞
小白_ysf14 小时前
Vue 中常见的加密方法(对称、非对称、杂凑算法)
前端·vue.js·算法
Mr_Xuhhh15 小时前
MySQL核心知识梳理:从连接到查询的完整指南
数据库·sql·mysql
多米Domi01115 小时前
0x3f 第49天 面向实习的八股背诵第六天 过了一遍JVM的知识点,看了相关视频讲解JVM内存,垃圾清理,买了plus,稍微看了点确定一下方向
jvm·数据结构·python·算法·leetcode
wsxlgg15 小时前
MySQL中count(*)、count(1)、count(字段)的区别
数据库·mysql
pengdott1 天前
Oracle RAC内存融合技术深度解析:集群性能的幕后引擎
数据库·oracle
csudata1 天前
绿色便携版PostgreSQL发行版重磅发布
数据库·postgresql
阳光九叶草LXGZXJ1 天前
达梦数据库-学习-48-DmDrs控制台命令(同步之Manager、CPT模块)
linux·运维·数据库·sql·学习
我科绝伦(Huanhuan Zhou)1 天前
脚本再升级,兼容Oracle 26ai一键安装
数据库·oracle