[LeetCode | SQL | 4 Problems a Day] LeetCode 2004, 1454, 1613, 1709

1. LeetCode 2004: The Number of Seniors and Juniors to Join the Company

1.1 Problem:

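Table: Candidates

+-------------+------+
| Column Name | Type |
+-------------+------+
| employee_id | int  |
| experience  | enum |
| salary      | int  |
+-------------+------+
employee_id is the primary key column of this table.
experience is an enum with one of the values ('Senior', 'Junior').
Each row of this table shows a candidate's id, monthly salary, and experience.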

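A company wants to hire new employees. The company's salary budget is $70000. Its hiring criteria are:

Hire the largest possible number of seniors.
After hiring the maximum number of seniors, use the remaining budget to hire the largest possible number of juniors.

Write an SQL query to find the number of seniors and juniors hired under the above criteria.

Return the result table in any order.

The query result format is shown in the following example.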

Example 1:

Input:
Candidates table:
+-------------+------------+--------+
| employee_id | experience | salary |
+-------------+------------+--------+
| 1           | Junior     | 10000  |
| 9           | Junior     | 10000  |
| 2           | Senior     | 20000  |
| 11          | Senior     | 20000  |
| 13          | Senior     | 50000  |
| 4           | Junior     | 40000  |
+-------------+------------+--------+
Output:
+------------+---------------------+
| experience | accepted_candidates |
+------------+---------------------+
| Senior     | 2                   |
| Junior     | 2                   |
+------------+---------------------+
Explanation:
We can hire 2 seniors with IDs (2, 11). The budget is $70000 and their salaries sum to $40000, so $30000 is left, which is not enough to hire the senior with ID 13.
We can hire 2 juniors with IDs (1, 9). The remaining budget is $30000 and their salaries sum to $20000, so $10000 is left, which is not enough to hire the junior with ID 4.

Example 2:

Input:
Candidates table:
+-------------+------------+--------+
| employee_id | experience | salary |
+-------------+------------+--------+
| 1           | Junior     | 10000  |
| 9           | Junior     | 10000  |
| 2           | Senior     | 80000  |
| 11          | Senior     | 80000  |
| 13          | Senior     | 80000  |
| 4           | Junior     | 40000  |
+-------------+------------+--------+
Output:
+------------+---------------------+
| experience | accepted_candidates |
+------------+---------------------+
| Senior     | 0                   |
| Junior     | 3                   |
+------------+---------------------+
Explanation:
We cannot hire any senior with the current budget, because hiring a single senior requires at least $80000.
We can hire all three juniors with the remaining budget.

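1.2 Approach:

Use two window functions, row_number (to rank candidates by salary) and sum (for the running salary total), and always handle both cases: the budget filter may keep some rows or none at all, which is why ifNull supplies a default of 0.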

1.3 Solution:

with tep1 as (
    -- First take all senior candidates and rank them by salary, cheapest first
    select employee_id, row_number() over (order by salary, employee_id) ranks
    , experience, salary
    from Candidates
    where experience = 'Senior'
), tep2 as (
    -- Likewise take all junior candidates and rank them by salary
    select employee_id, row_number() over (order by salary, employee_id) ranks
    , experience, salary
    from Candidates
    where experience = 'Junior'
), tep3 as (
    -- Running total of senior salaries up to and including each rank
    select ranks, sum(salary) over (order by ranks) sums
    from tep1
), tep4 as (
    -- Running total of junior salaries up to and including each rank
    select ranks, sum(salary) over (order by ranks) sums
    from tep2
), tep5 as (
    -- Largest number of seniors the budget can pay for
    -- max(ranks) is null when the where clause filters out every row, so default to 0
    select 'Senior' experience, ifNull(max(ranks), 0) accepted_candidates
    from tep3
    where sums <= 70000
)

select experience, accepted_candidates
from tep5

union all
-- Largest number of juniors the remaining budget can pay for
-- Two ifNull calls are needed: one for the junior count, one for the seniors' total salary
select 'Junior' experience, ifNull(max(ranks), 0) accepted_candidates
from tep4
where sums <= 70000-ifNull((select sums from tep3 where ranks = (select accepted_candidates from tep5)), 0)
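
As a sanity check on the greedy logic, the same computation can be sketched outside SQL. The Python snippet below is only an illustration: the function name count_hires and the tuple layout are assumptions of this sketch, not part of the original solution. It sorts each group by salary, accumulates a running total, and counts how many candidates fit, mirroring row_number plus sum over.

```python
from itertools import accumulate

def count_hires(candidates, budget=70000):
    """candidates: list of (employee_id, experience, salary) tuples (hypothetical layout)."""
    def fit(salaries, limit):
        # Running totals of the cheapest-first salaries, like sum(salary) over (order by ranks)
        totals = list(accumulate(sorted(salaries)))
        affordable = [t for t in totals if t <= limit]
        # Return (number hired, money spent); (0, 0) plays the role of ifNull(..., 0)
        return len(affordable), (affordable[-1] if affordable else 0)

    seniors = [s for _, e, s in candidates if e == "Senior"]
    juniors = [s for _, e, s in candidates if e == "Junior"]
    n_senior, spent = fit(seniors, budget)
    n_junior, _ = fit(juniors, budget - spent)
    return {"Senior": n_senior, "Junior": n_junior}

example1 = [(1, "Junior", 10000), (9, "Junior", 10000), (2, "Senior", 20000),
            (11, "Senior", 20000), (13, "Senior", 50000), (4, "Junior", 40000)]
print(count_hires(example1))  # {'Senior': 2, 'Junior': 2}
```

Seniors are settled first and only the money they actually cost is subtracted, which is the same ordering the SQL expresses through tep5 and the final where clause.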

2. LeetCode 1454: Active Users

2.1 Problem:

Table: Accounts

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| id            | int     |
| name          | varchar |
+---------------+---------+
id is the primary key (a column with unique values) of this table.
This table contains the account id and the user name of each account.

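Table: Logins

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| id            | int     |
| login_date    | date    |
+---------------+---------+
This table may contain duplicate rows.
This table contains the account id of the user who logged in and the login date. A user may log in several times in one day.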

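Active users are those who logged in to their accounts on at least 5 consecutive days.

Write a solution to find the id and name of the active users.

Return the result table ordered by id.

The result format is shown in the following example.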

Example 1:

Input:
Accounts table:
+----+----------+
| id | name     |
+----+----------+
| 1  | Winston  |
| 7  | Jonathan |
+----+----------+

Logins table:
+----+------------+
| id | login_date |
+----+------------+
| 7  | 2020-05-30 |
| 1  | 2020-05-30 |
| 7  | 2020-05-31 |
| 7  | 2020-06-01 |
| 7  | 2020-06-02 |
| 7  | 2020-06-02 |
| 7  | 2020-06-03 |
| 1  | 2020-06-07 |
| 7  | 2020-06-10 |
+----+------------+
Output:
+----+----------+
| id | name     |
+----+----------+
| 7  | Jonathan |
+----+----------+
Explanation:
User Winston with id = 1 logged in only 2 times, on 2 different days, so Winston is not an active user.
User Jonathan with id = 7 logged in 7 times on 6 different days, and 5 of those 6 days are consecutive, so Jonathan is an active user.

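Follow-up:

Could you write a general solution if active users are those who logged in on at least n consecutive days?

2.2 Approach:

Consecutive days => the LEAD window function: after removing duplicates, compare each login date with the date 4 rows later for the same user.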

2.3 Solution:

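with tep1 as (
    -- Step 1: remove duplicate logins
    select distinct id, login_date
    from Logins
), tep2 as (
    -- Use the lead window function to fetch the date 4 rows after the current row
    -- No default is given, so lead_date is null when fewer than 4 rows follow
    select id, login_date, lead(login_date, 4) over (partition by id order by login_date) lead_date
    from tep1
), tep3 as (
    -- If the date 4 rows later is exactly 4 days after this row's date,
    -- the 5 dates are consecutive (because there are no duplicates)
    select distinct id
    from tep2
    where datediff(lead_date, login_date) = 4
)
select t1.id, name
from tep3 t1 
join Accounts t2 
on t1.id = t2.id
order by t1.id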
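
For the follow-up question, the same idea generalizes by looking n-1 rows ahead and requiring a gap of exactly n-1 days. The Python sketch below illustrates that reasoning on the example data; the function name active_ids and the input layout are assumptions of this sketch, and it is not the SQL the post submits.

```python
from datetime import date, timedelta

def active_ids(logins, n=5):
    """logins: iterable of (id, login_date) pairs, duplicates allowed (hypothetical layout)."""
    by_user = {}
    for uid, day in set(logins):  # set() plays the role of select distinct
        by_user.setdefault(uid, []).append(day)
    active = []
    for uid, days in by_user.items():
        days.sort()
        # Compare each date with the one n-1 positions later, like lead(login_date, n-1)
        if any(days[i + n - 1] - days[i] == timedelta(days=n - 1)
               for i in range(len(days) - n + 1)):
            active.append(uid)
    return sorted(active)

logins = [(7, date(2020, 5, 30)), (1, date(2020, 5, 30)), (7, date(2020, 5, 31)),
          (7, date(2020, 6, 1)), (7, date(2020, 6, 2)), (7, date(2020, 6, 2)),
          (7, date(2020, 6, 3)), (1, date(2020, 6, 7)), (7, date(2020, 6, 10))]
print(active_ids(logins))  # [7]
```

In SQL terms this would presumably amount to replacing both literal 4s in the query with n-1; the distinct step stays essential, since a duplicate date would break the "n-1 rows later means n-1 days later" argument.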

3. LeetCode 1613: Find the Missing IDs

3.1 Problem:

Table: Customers

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| customer_id   | int     |
| customer_name | varchar |
+---------------+---------+
customer_id is the primary key of this table.
Each row of this table contains the name and id of a customer.

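Write a solution to find all missing customer ids. A missing customer id is one that is not in the Customers table but lies between 1 and the largest customer_id in the table.

Note: the largest customer_id will not exceed 100.

Return the result table ordered by ids in ascending order.

The query result format is shown in the following example.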

Example 1:

Input:
Customers table:
+-------------+---------------+
| customer_id | customer_name |
+-------------+---------------+
| 1           | Alice         |
| 4           | Bob           |
| 5           | Charlie       |
+-------------+---------------+
Output:
+-----+
| ids |
+-----+
| 2   |
| 3   |
+-----+
Explanation:
The largest customer_id in the table is 5, so within the range [1,5], IDs 2 and 3 are missing from the table.

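3.2 Approach:

Use a recursive CTE to generate every integer from 1 to the largest customer_id, then keep the ones that do not appear in Customers.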

3.3 Solution:

with recursive tep1 as (
    -- Use a recursive CTE to generate every id from 1 to the largest id in the table
    select 1 n
    union all
    select n+1 n
    from tep1
    where n < (select max(customer_id) from Customers)
)

select n ids
from tep1
where n not in (select customer_id from Customers)
order by ids
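
The recursive CTE can be tried locally without a MySQL server. The sketch below is an assumption-laden illustration: it loads the example rows into an in-memory SQLite database through Python's standard sqlite3 module and runs the same query text (SQLite also accepts with recursive). MySQL behavior is not exercised here.

```python
import sqlite3

# In-memory database holding the example rows from the problem statement
conn = sqlite3.connect(":memory:")
conn.execute("create table Customers (customer_id integer primary key, customer_name text)")
conn.executemany("insert into Customers values (?, ?)",
                 [(1, "Alice"), (4, "Bob"), (5, "Charlie")])

query = """
with recursive tep1 as (
    select 1 n
    union all
    select n+1 n
    from tep1
    where n < (select max(customer_id) from Customers)
)
select n ids
from tep1
where n not in (select customer_id from Customers)
order by ids
"""
# Collect the single ids column into a plain list
missing = [row[0] for row in conn.execute(query)]
print(missing)
```

The recursion stops because the where clause in the recursive member fails once n reaches the largest customer_id, so the CTE holds exactly the integers 1 through that maximum.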

4. LeetCode 1709: Biggest Window Between Visits

4.1 Problem:

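Table: UserVisits

+-------------+------+
| Column Name | Type |
+-------------+------+
| user_id     | int  |
| visit_date  | date |
+-------------+------+
This table has no primary key; it may contain duplicate rows.
This table contains the log of dates on which users visited a certain retailer.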

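Assume today's date is '2021-1-1'.

Write a solution that, for each user_id, finds the biggest window, in days, between each visit and the next one (or today, if the visit is the user's last).

Return the result table ordered by user_id.

The result format is shown in the following example: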

Example 1:

Input:
UserVisits table:
+---------+------------+
| user_id | visit_date |
+---------+------------+
| 1       | 2020-11-28 |
| 1       | 2020-10-20 |
| 1       | 2020-12-3  |
| 2       | 2020-10-5  |
| 2       | 2020-12-9  |
| 3       | 2020-11-11 |
+---------+------------+
Output:
+---------+---------------+
| user_id | biggest_window|
+---------+---------------+
| 1       | 39            |
| 2       | 65            |
| 3       | 51            |
+---------+---------------+
Explanation:
For the first user, the windows in question are between the following dates:
    - 2020-10-20 and 2020-11-28, a total of 39 days.
    - 2020-11-28 and 2020-12-3, a total of 5 days.
    - 2020-12-3 and 2021-1-1, a total of 29 days.
So the biggest window is 39 days.
For the second user, the windows in question are between the following dates:
    - 2020-10-5 and 2020-12-9, a total of 65 days.
    - 2020-12-9 and 2021-1-1, a total of 23 days.
So the biggest window is 65 days.
For the third user, the only window in question is between 2020-11-11 and 2021-1-1, a total of 51 days.

4.2 Approach:

We need the difference between each row and the next one, with a default value for the last row => the LEAD window function.

4.3 Solution:

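-- A clear case for the LEAD window function
with tep1 as (
    -- lead returns the next row within each user's visits, ordered by date
    -- The last visit has no next row, so the default value is today
    select user_id, visit_date, lead(visit_date, 1, '2021-1-1') over (partition by user_id order by visit_date) lead_date
    from UserVisits
), tep2 as (
    -- Difference in days between each visit and the next one
    select user_id, datediff(lead_date, visit_date) diff
    from tep1
)
-- Take the largest difference per user; the problem asks for the result ordered by user_id
select user_id, max(diff) biggest_window
from tep2
group by user_id
order by user_id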
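
To double-check the figures in the example, the pairing that LEAD performs can be reproduced by hand. This Python sketch is only a cross-check under stated assumptions: the function name biggest_windows and the (user_id, visit_date) tuple layout are introduced here for illustration.

```python
from datetime import date

def biggest_windows(visits, today=date(2021, 1, 1)):
    """visits: iterable of (user_id, visit_date) pairs (hypothetical layout)."""
    by_user = {}
    for uid, day in visits:
        by_user.setdefault(uid, []).append(day)
    result = {}
    for uid, days in sorted(by_user.items()):
        days = sorted(days)
        # Pair each visit with the next one; the last visit is paired with today,
        # which is what the third argument of lead(visit_date, 1, '2021-1-1') does
        following = days[1:] + [today]
        result[uid] = max((nxt - cur).days for cur, nxt in zip(days, following))
    return result

visits = [(1, date(2020, 11, 28)), (1, date(2020, 10, 20)), (1, date(2020, 12, 3)),
          (2, date(2020, 10, 5)), (2, date(2020, 12, 9)), (3, date(2020, 11, 11))]
print(biggest_windows(visits))  # {1: 39, 2: 65, 3: 51}
```

Duplicate rows need no special handling here: two identical dates only contribute a window of 0 days, which never changes the maximum.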