[LeetCode | SQL | 3 Problems a Day] LeetCode 2988, 569, 1132, 1158

1 hard + 3 medium; none of them is especially difficult.

1. LeetCode 2988: Manager of the Largest Department

1.1 Problem:

Table: Employees

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| emp_id      | int     |
| emp_name    | varchar |
| dep_id      | int     |
| position    | varchar |
+-------------+---------+
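emp_id is the column with unique values for this table.
This table contains emp_id, emp_name, dep_id, and position.

Find the name of the manager of the largest department. There may be multiple largest departments when several departments have the same number of employees.

Return the result table ordered by dep_id in ascending order.

The result format is shown in the following example.

Example 1:

Input: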
Employees table:
+--------+----------+--------+---------------+
| emp_id | emp_name | dep_id | position      | 
+--------+----------+--------+---------------+
| 156    | Michael  | 107    | Manager       |
| 112    | Lucas    | 107    | Consultant    |    
| 8      | Isabella | 101    | Manager       | 
| 160    | Joseph   | 100    | Manager       | 
| 80     | Aiden    | 100    | Engineer      | 
| 190    | Skylar   | 100    | Freelancer    | 
| 196    | Stella   | 101    | Coordinator   |
| 167    | Audrey   | 100    | Consultant    |
| 97     | Nathan   | 101    | Supervisor    |
| 128    | Ian      | 101    | Administrator |
| 81     | Ethan    | 107    | Administrator |
+--------+----------+--------+---------------+
Output:
+--------------+--------+
| manager_name | dep_id | 
+--------------+--------+
| Joseph       | 100    | 
| Isabella     | 101    | 
+--------------+--------+
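Explanation:
- Departments 100 and 101 each have 4 employees, while department 107 has 3 employees. Since departments 100 and 101 have the same number of employees, both of their managers are included.
The output table is ordered by dep_id in ascending order.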

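1.2 Approach:

It doesn't really feel like a medium problem; it is closer to an easy problem on the harder side. Count the employees per department, keep the departments whose count is at least as large as every other department's count, then look up the managers of those departments.

1.3 Solution:

-- First find the dep_id of the largest department(s)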
with tep as (
    select dep_id
    from Employees
    group by dep_id
    having count(*) >= all(
        select count(*)
        from Employees
        group by dep_id
    )
)
-- Then look for the Manager within the largest department(s).
select emp_name manager_name, dep_id
from Employees
where dep_id in (
    select * from tep
)
and position = 'Manager'
order by dep_id
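
As a cross-check, the sketch below runs the same idea against the sample rows from Example 1. It is written in Python with the standard sqlite3 module only so that it is self-contained. SQLite does not support the ">= all(...)" quantifier, so the comparison is rewritten as "= (select max(cnt) ...)"; that rewrite is a substitution made for this example, not the query above. The names dep_size, cnt, ROWS and largest_department_managers are made up for illustration.

```python
import sqlite3

# Sample rows copied from Example 1 of the problem statement.
ROWS = [
    (156, "Michael", 107, "Manager"),
    (112, "Lucas", 107, "Consultant"),
    (8, "Isabella", 101, "Manager"),
    (160, "Joseph", 100, "Manager"),
    (80, "Aiden", 100, "Engineer"),
    (190, "Skylar", 100, "Freelancer"),
    (196, "Stella", 101, "Coordinator"),
    (167, "Audrey", 100, "Consultant"),
    (97, "Nathan", 101, "Supervisor"),
    (128, "Ian", 101, "Administrator"),
    (81, "Ethan", 107, "Administrator"),
]

# Same idea as the solution above, with ">= all(...)" replaced by "= max(...)".
QUERY = """
with dep_size as (
    select dep_id, count(*) as cnt
    from Employees
    group by dep_id
)
select e.emp_name as manager_name, e.dep_id
from Employees e
join dep_size d on d.dep_id = e.dep_id
where d.cnt = (select max(cnt) from dep_size)
  and e.position = 'Manager'
order by e.dep_id
"""


def largest_department_managers(rows):
    """Load rows into an in-memory SQLite table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table Employees (emp_id int, emp_name text, dep_id int, position text)"
    )
    conn.executemany("insert into Employees values (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(largest_department_managers(ROWS))
```

On the sample data this should return Joseph (100) and Isabella (101), matching the expected output of the problem.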

2. LeetCode 569: Median Employee Salary

2.1 Problem:

Table: Employee

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| id           | int     |
| company      | varchar |
| salary       | int     |
+--------------+---------+
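id is the primary key (column with unique values) for this table.
Each row of this table indicates the company and the salary of one employee.

Write a solution to find the median salary of each company.

Return the result table in any order.

The result format is shown in the following example.

Example 1:

Input:
Employee table: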
+----+---------+--------+
| id | company | salary |
+----+---------+--------+
| 1  | A       | 2341   |
| 2  | A       | 341    |
| 3  | A       | 15     |
| 4  | A       | 15314  |
| 5  | A       | 451    |
| 6  | A       | 513    |
| 7  | B       | 15     |
| 8  | B       | 13     |
| 9  | B       | 1154   |
| 10 | B       | 1345   |
| 11 | B       | 1221   |
| 12 | B       | 234    |
| 13 | C       | 2345   |
| 14 | C       | 2645   |
| 15 | C       | 2645   |
| 16 | C       | 2652   |
| 17 | C       | 65     |
+----+---------+--------+
Output:
+----+---------+--------+
| id | company | salary |
+----+---------+--------+
| 5  | A       | 451    |
| 6  | A       | 513    |
| 12 | B       | 234    |
| 9  | B       | 1154   |
| 14 | C       | 2645   |
+----+---------+--------+

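**Follow-up:** Can you solve it without using any built-in functions or window functions?

2.2 Approach:

See the comments in the code.

Rank the employees of each company by salary, then split into two cases (an odd or an even number of employees) and combine the results with union all.

2.3 Solution: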

-- First give every employee a rank within their company
with tep1 as (
    -- ties on salary are broken by id so that the ranking is deterministic
    select id, company, salary, row_number() over (partition by company order by salary, id) ranks
    from Employee 
), tep2 as (
    -- companies with an odd number of employees
    select company
    from tep1
    group by company
    having count(*) % 2 = 1
)
-- union all covers the two cases
-- odd number of employees: pick the row ranked ceil(count(*) / 2)
select id, company, salary
from tep1 t1
where company in (
    select * from tep2
) and ranks = (select ceil(count(*) / 2) from tep1 t2 where t1.company = t2.company)

union all

-- even number of employees: two rows are needed,
-- ranked count(*) / 2 and count(*) / 2 + 1
-- (the parentheses matter: and binds tighter than or)
select id, company, salary
from tep1 t3
where company not in (
    select * from tep2
) 
and (
    ranks = (select count(*) / 2 from tep1 t4 where t3.company = t4.company)
    or ranks = (select count(*) / 2 + 1 from tep1 t4 where t3.company = t4.company)
)
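
The two cases can also be folded into a single range check: with n employees in a company, the median rows are exactly those whose rank lies between n / 2 and n / 2 + 1 (using non-integer division). The sketch below tries this on the sample data using Python's standard sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (needed for window functions), and the names ranked, rn, cnt, ROWS and median_rows are made up for this example; it is an alternative formulation, not the solution above.

```python
import sqlite3

# Sample rows copied from Example 1 of the problem statement.
ROWS = [
    (1, "A", 2341), (2, "A", 341), (3, "A", 15),
    (4, "A", 15314), (5, "A", 451), (6, "A", 513),
    (7, "B", 15), (8, "B", 13), (9, "B", 1154),
    (10, "B", 1345), (11, "B", 1221), (12, "B", 234),
    (13, "C", 2345), (14, "C", 2645), (15, "C", 2645),
    (16, "C", 2652), (17, "C", 65),
]

# rn is the rank inside the company, cnt the company size.
# Odd cnt (e.g. 5): the range [2.5, 3.5] contains only rank 3.
# Even cnt (e.g. 6): the range [3.0, 4.0] contains ranks 3 and 4.
QUERY = """
with ranked as (
    select id, company, salary,
           row_number() over (partition by company order by salary, id) as rn,
           count(*) over (partition by company) as cnt
    from Employee
)
select id, company, salary
from ranked
where rn between cnt / 2.0 and cnt / 2.0 + 1
order by company, salary
"""


def median_rows(rows):
    """Load rows into an in-memory SQLite table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table Employee (id int, company text, salary int)")
    conn.executemany("insert into Employee values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in median_rows(ROWS):
    print(row)
```

The final order by is only there to make the output stable; the problem accepts any order.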

3. LeetCode 1132: Reported Posts II

3.1 Problem:

Table: Actions

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| user_id       | int     |
| post_id       | int     |
| action_date   | date    |
| action        | enum    |
| extra         | varchar |
+---------------+---------+
This table may have duplicate rows.
The action column is an ENUM type with possible values ('view', 'like', 'reaction', 'comment', 'report', 'share').
The extra column holds optional information about the action, such as a reason for the report or a type of reaction.

Table: Removals

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| post_id       | int     |
| remove_date   | date    | 
+---------------+---------+
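post_id is the primary key (column with unique values) of this table.
Each row of this table indicates a post that was removed, either because it was reported or as a result of an admin review.

Write a solution to find the average daily percentage of posts that got removed after being reported as spam, rounded to 2 decimal places.

The result format is as follows.

Example 1:

Input: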
Actions table:
+---------+---------+-------------+--------+--------+
| user_id | post_id | action_date | action | extra  |
+---------+---------+-------------+--------+--------+
| 1       | 1       | 2019-07-01  | view   | null   |
| 1       | 1       | 2019-07-01  | like   | null   |
| 1       | 1       | 2019-07-01  | share  | null   |
| 2       | 2       | 2019-07-04  | view   | null   |
| 2       | 2       | 2019-07-04  | report | spam   |
| 3       | 4       | 2019-07-04  | view   | null   |
| 3       | 4       | 2019-07-04  | report | spam   |
| 4       | 3       | 2019-07-02  | view   | null   |
| 4       | 3       | 2019-07-02  | report | spam   |
| 5       | 2       | 2019-07-03  | view   | null   |
| 5       | 2       | 2019-07-03  | report | racism |
| 5       | 5       | 2019-07-03  | view   | null   |
| 5       | 5       | 2019-07-03  | report | racism |
+---------+---------+-------------+--------+--------+
Removals table:
+---------+-------------+
| post_id | remove_date |
+---------+-------------+
| 2       | 2019-07-20  |
| 3       | 2019-07-18  |
+---------+-------------+
Output:
+-----------------------+
| average_daily_percent |
+-----------------------+
| 75.00                 |
+-----------------------+
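Explanation:
The spam removal rate on 2019-07-04 is 50%, because two posts were reported as spam but only one of them was removed.
The spam removal rate on 2019-07-02 is 100%, because one post was reported as spam and it was removed.
No spam was reported on the other days, so the average is (50 + 100) / 2 = 75%.
Note that the output is a single average value and that we do not care about the removal dates.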

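3.2 Approach:

See the comments in the code. Deduplicate the spam reports per post and day, compute for each day the share of reported posts that appear in Removals, then average the daily percentages.

3.3 Solution:

-- First find which posts were reported as spam, and on which day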
with tep as (
    select distinct post_id, action_date
    from Actions
    where extra = 'spam'
), tep1 as (
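    -- group by action_date
    -- for each day, look up that day's spam posts in tep
    -- and use where to filter out the ones that were never removed
    -- what remains are the posts reported that day and removed later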
    select (
        select count(*)
        from (select post_id from tep t2 where t1.action_date = t2.action_date) t
        where t.post_id in (select post_id from Removals)
    ) / count(*) * 100 perc
    from tep t1
    group by action_date
)

-- Then average the daily percentages
select round(avg(perc), 2) average_daily_percent
from tep1
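
The correlated subquery can also be written as a left join against Removals: count(r.post_id) only counts the reported posts that found a match, while count(*) counts all of them. The sketch below checks this variant on the sample data with Python's standard sqlite3 module. It additionally filters on action = 'report', which the solution above does not do, and the names spam, daily, ACTIONS, REMOVALS and average_daily_percent are made up for this example.

```python
import sqlite3

# Sample rows copied from Example 1 (None stands for the null extra column).
ACTIONS = [
    (1, 1, "2019-07-01", "view", None),
    (1, 1, "2019-07-01", "like", None),
    (1, 1, "2019-07-01", "share", None),
    (2, 2, "2019-07-04", "view", None),
    (2, 2, "2019-07-04", "report", "spam"),
    (3, 4, "2019-07-04", "view", None),
    (3, 4, "2019-07-04", "report", "spam"),
    (4, 3, "2019-07-02", "view", None),
    (4, 3, "2019-07-02", "report", "spam"),
    (5, 2, "2019-07-03", "view", None),
    (5, 2, "2019-07-03", "report", "racism"),
    (5, 5, "2019-07-03", "view", None),
    (5, 5, "2019-07-03", "report", "racism"),
]
REMOVALS = [(2, "2019-07-20"), (3, "2019-07-18")]

QUERY = """
with spam as (
    -- one row per (post, day), even if the post was reported several times
    select distinct post_id, action_date
    from Actions
    where action = 'report' and extra = 'spam'
), daily as (
    -- count(r.post_id) skips the NULLs produced by unmatched left-join rows
    select s.action_date,
           100.0 * count(r.post_id) / count(*) as perc
    from spam s
    left join Removals r on r.post_id = s.post_id
    group by s.action_date
)
select round(avg(perc), 2) as average_daily_percent
from daily
"""


def average_daily_percent(actions, removals):
    """Load both tables into an in-memory SQLite database and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table Actions (user_id int, post_id int, action_date text, action text, extra text)"
    )
    conn.execute("create table Removals (post_id int primary key, remove_date text)")
    conn.executemany("insert into Actions values (?, ?, ?, ?, ?)", actions)
    conn.executemany("insert into Removals values (?, ?)", removals)
    value = conn.execute(QUERY).fetchone()[0]
    conn.close()
    return value


print(average_daily_percent(ACTIONS, REMOVALS))
```

Multiplying by 100.0 before dividing keeps the arithmetic in floating point, which matters in engines where dividing two integers truncates.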

4. LeetCode 1158: Market Analysis I

4.1 Problem:

Table: Users

+----------------+---------+
| Column Name    | Type    |
+----------------+---------+
| user_id        | int     |
| join_date      | date    |
| favorite_brand | varchar |
+----------------+---------+
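user_id is the primary key (column with unique values) of this table.
This table holds the users of an online shopping website where users can buy and sell items.

Table: Orders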

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| order_id      | int     |
| order_date    | date    |
| item_id       | int     |
| buyer_id      | int     |
| seller_id     | int     |
+---------------+---------+
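order_id is the primary key (column with unique values) of this table.
item_id is a foreign key (reference column) to the Items table.
buyer_id and seller_id are foreign keys to the Users table.

Table: Items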

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| item_id       | int     |
| item_brand    | varchar |
+---------------+---------+
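item_id is the primary key (column with unique values) of this table.

Write a solution to find, for each user, the join date and the number of orders they made as a buyer in **2019**.

Return the result table in any order.

The result format is as follows.

Example 1:

Input:
Users table: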
+---------+------------+----------------+
| user_id | join_date  | favorite_brand |
+---------+------------+----------------+
| 1       | 2018-01-01 | Lenovo         |
| 2       | 2018-02-09 | Samsung        |
| 3       | 2018-01-19 | LG             |
| 4       | 2018-05-21 | HP             |
+---------+------------+----------------+
Orders table:
+----------+------------+---------+----------+-----------+
| order_id | order_date | item_id | buyer_id | seller_id |
+----------+------------+---------+----------+-----------+
| 1        | 2019-08-01 | 4       | 1        | 2         |
| 2        | 2018-08-02 | 2       | 1        | 3         |
| 3        | 2019-08-03 | 3       | 2        | 3         |
| 4        | 2018-08-04 | 1       | 4        | 2         |
| 5        | 2018-08-04 | 1       | 3        | 4         |
| 6        | 2019-08-05 | 2       | 2        | 4         |
+----------+------------+---------+----------+-----------+
Items table:
+---------+------------+
| item_id | item_brand |
+---------+------------+
| 1       | Samsung    |
| 2       | Lenovo     |
| 3       | LG         |
| 4       | HP         |
+---------+------------+
Output:
+-----------+------------+----------------+
| buyer_id  | join_date  | orders_in_2019 |
+-----------+------------+----------------+
| 1         | 2018-01-01 | 1              |
| 2         | 2018-02-09 | 2              |
| 3         | 2018-01-19 | 0              |
| 4         | 2018-05-21 | 0              |
+-----------+------------+----------------+

4.2 Approach:

A classic left outer join problem.

4.3 Solution:

-- First keep only the orders placed in 2019
with tep as (
    select *
    from Orders
    where substring(order_date, 1, 4) = '2019'
)

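-- Left outer join, grouped by user_id
-- Strictly speaking join_date is not a grouping column, so it should not be usable in the select list
-- but user_id is the primary key, so join_date has a single value within each group and this works

-- The if() handles users that only appear because of the left join (no matching order in 2019):
-- such a user still produces one row filled with NULLs, so count(*) would return 1,
-- which is not what the problem asks for; return 0 for them instead.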
select user_id buyer_id, join_date
,if(buyer_id is null, 0, count(*)) orders_in_2019
from Users t1
left join tep t2
on t1.user_id = t2.buyer_id
group by user_id
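
The special case for unmatched users can be avoided by counting a column from the right-hand table: count(o.order_id) ignores NULLs, so users without 2019 orders get 0 automatically. The sketch below tries this on the sample data with Python's standard sqlite3 module. It moves the year filter into the join condition and compares ISO date strings, which works here because SQLite stores the dates as text; in MySQL a filter such as year(order_date) = 2019 would be the more usual spelling. The names USERS, ORDERS and orders_in_2019 (as a function) are made up for this example, and the Items table is left out because the query does not need it.

```python
import sqlite3

# Sample rows copied from Example 1 of the problem statement.
USERS = [
    (1, "2018-01-01", "Lenovo"),
    (2, "2018-02-09", "Samsung"),
    (3, "2018-01-19", "LG"),
    (4, "2018-05-21", "HP"),
]
ORDERS = [
    (1, "2019-08-01", 4, 1, 2),
    (2, "2018-08-02", 2, 1, 3),
    (3, "2019-08-03", 3, 2, 3),
    (4, "2018-08-04", 1, 4, 2),
    (5, "2018-08-04", 1, 3, 4),
    (6, "2019-08-05", 2, 2, 4),
]

# The date filter lives in the ON clause so that users without
# 2019 orders are kept by the left join instead of being filtered out.
QUERY = """
select u.user_id as buyer_id,
       u.join_date,
       count(o.order_id) as orders_in_2019
from Users u
left join Orders o
  on o.buyer_id = u.user_id
 and o.order_date >= '2019-01-01'
 and o.order_date < '2020-01-01'
group by u.user_id, u.join_date
order by u.user_id
"""


def orders_in_2019(users, orders):
    """Load both tables into an in-memory SQLite database and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table Users (user_id int primary key, join_date text, favorite_brand text)"
    )
    conn.execute(
        "create table Orders (order_id int primary key, order_date text, "
        "item_id int, buyer_id int, seller_id int)"
    )
    conn.executemany("insert into Users values (?, ?, ?)", users)
    conn.executemany("insert into Orders values (?, ?, ?, ?, ?)", orders)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in orders_in_2019(USERS, ORDERS):
    print(row)
```

Grouping by both user_id and join_date also sidesteps the question of whether a non-grouped column may appear in the select list.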