196. 删除重复的电子邮箱
题目:编写解决方案删除 所有重复的电子邮件,只保留一个具有最小 id
的唯一电子邮件。
(对于 SQL 用户,请注意你应该编写一个 DELETE
语句而不是 SELECT
语句。)
(对于 Pandas 用户,请注意你应该直接修改 Person
表。)
数据准备:
sql
Create table If Not Exists Person (Id int, Email varchar(255))
Truncate table Person
insert into Person (id, email) values ('1', 'john@example.com')
insert into Person (id, email) values ('2', 'bob@example.com')
insert into Person (id, email) values ('3', 'john@example.com')
代码实现:
sql
delete from Person where id not in (
select a.id from (select min(id) id from Person group by email)a);
197. 上升的温度
题目:编写解决方案,找出与之前(昨天的)日期相比温度更高的所有日期的 id
。
数据准备:
sql
Create table If Not Exists Weather (id int, recordDate date, temperature int)
Truncate table Weather
insert into Weather (id, recordDate, temperature) values ('1', '2015-01-01', '10')
insert into Weather (id, recordDate, temperature) values ('2', '2015-01-02', '25')
insert into Weather (id, recordDate, temperature) values ('3', '2015-01-03', '20')
insert into Weather (id, recordDate, temperature) values ('4', '2015-01-04', '30')
思路:
|----|------------|-------------|-------------|-------------------------|----------------------------------|
| | | | weather1 | | |
| | | | | date_add得出前一天的日期 | |
| | id | recordDate | temperature | yesterday | |
| | 1 | 2015/1/1 | 10 | 2014/12/31 | |
| | 2 | 2015/1/2 | 25 | 2015/1/1 | |
| | 3 | 2015/1/3 | 20 | 2015/1/2 | |
| | 4 | 2015/1/4 | 30 | 2015/1/3 | |
| | | | | | |
| | | | | | |
| | | weather1与weather表join连接 ||| |
| | | | | | |
| id | recordDate | temperature | yesterday | yesterday日期的temperature | 今日的temperature是否大于昨日的temoerature |
| 1 | 2015/1/1 | 10 | 2014/12/31 | | |
| 2 | 2015/1/2 | 25 | 2015/1/1 | 10 | 1 |
| 3 | 2015/1/3 | 20 | 2015/1/2 | 25 | 0 |
| 4 | 2015/1/4 | 30 | 2015/1/3 | 20 | 1 |
代码实现:
sql
with t1 as (select id,temperature,recordDate,date_sub(recordDate,interval 1 day) yesterday from weather)
select t1.id from t1 join weather on Weather.recordDate=t1.yesterday
where t1.temperature>Weather.temperature;
262. 行程和用户
题目:编写解决方案找出 "2013-10-01"
至 "2013-10-03"
期间非禁止用户(乘客和司机都必须未被禁止 )的取消率。非禁止用户即 banned 为 No 的用户,禁止用户即 banned 为 Yes 的用户。其中取消率 Cancellation Rate
需要四舍五入保留 两位小数
取消率 的计算方式如下:(被司机或乘客取消的非禁止用户生成的订单数量) / (非禁止用户生成的订单总数)。
数据准备:
sql
Create table If Not Exists Trips (id int, client_id int, driver_id int, city_id int, status ENUM('completed', 'cancelled_by_driver', 'cancelled_by_client'), request_at varchar(50))
Create table If Not Exists Users (users_id int, banned varchar(50), role ENUM('client', 'driver', 'partner'))
Truncate table Trips
insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('1', '1', '10', '1', 'completed', '2013-10-01')
insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('2', '2', '11', '1', 'cancelled_by_driver', '2013-10-01')
insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('3', '3', '12', '6', 'completed', '2013-10-01')
insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('4', '4', '13', '6', 'cancelled_by_client', '2013-10-01')
insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('5', '1', '10', '1', 'completed', '2013-10-02')
insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('6', '2', '11', '6', 'completed', '2013-10-02')
insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('7', '3', '12', '6', 'completed', '2013-10-02')
insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('8', '2', '12', '12', 'completed', '2013-10-03')
insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('9', '3', '10', '12', 'completed', '2013-10-03')
insert into Trips (id, client_id, driver_id, city_id, status, request_at) values ('10', '4', '13', '12', 'cancelled_by_driver', '2013-10-03')
Truncate table Users
insert into Users (users_id, banned, role) values ('1', 'No', 'client')
insert into Users (users_id, banned, role) values ('2', 'Yes', 'client')
insert into Users (users_id, banned, role) values ('3', 'No', 'client')
insert into Users (users_id, banned, role) values ('4', 'No', 'client')
insert into Users (users_id, banned, role) values ('10', 'No', 'driver')
insert into Users (users_id, banned, role) values ('11', 'No', 'driver')
insert into Users (users_id, banned, role) values ('12', 'No', 'driver')
insert into Users (users_id, banned, role) values ('13', 'No', 'driver')
代码实现:
sql
with t as (select * from Trips where
client_id in (select users_id from Users where banned='no')
and driver_id in (select users_id from Users where banned='no'))
select request_at day,round(sum(if(status != 'completed',1,0))/count(id),2) 'Cancellation Rate' from t
where request_at between '2013-10-01' and '2013-10-03' group by request_at;
511. 游戏玩法1
问题:查询每位玩家 第一次登录平台的日期。
数据准备:
sql
Create table If Not Exists Activity (player_id int, device_id int, event_date date, games_played int)
Truncate table Activity
insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-01', '5')
insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-05-02', '6')
insert into Activity (player_id, device_id, event_date, games_played) values ('2', '3', '2017-06-25', '1')
insert into Activity (player_id, device_id, event_date, games_played) values ('3', '1', '2016-03-02', '0')
insert into Activity (player_id, device_id, event_date, games_played) values ('3', '4', '2018-07-03', '5')
代码实现:
sql
select player_id,min(event_date) from activity group by player_id;
512. 游戏玩法2
问题·:请编写解决方案,描述每一个玩家首次登陆的设备名称
数据准备:
sql
Create table If Not Exists Activity (player_id int, device_id int, event_date date, games_played int)
Truncate table Activity;
insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-01', '5')
insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-05-02', '6')
insert into Activity (player_id, device_id, event_date, games_played) values ('2', '3', '2017-06-25', '1')
insert into Activity (player_id, device_id, event_date, games_played) values ('3', '1', '2016-03-02', '0')
insert into Activity (player_id, device_id, event_date, games_played) values ('3', '4', '2018-07-03', '5')
代码实现:
sql
with t1 as (select player_id player_id1,min(event_date) event_date1 from activity group by player_id)
select player_id,device_id from activity join t1 on Activity.player_id=t1.player_id1
where event_date=t1.event_date1;
513. 游戏玩法3
问题:编写一个解决方案,同时报告每组玩家和日期,以及玩家到 目前为止 玩了多少游戏。 也就是说,玩家在该日期之前所玩的游戏总数。详细情况请查看示例。
数据准备:
sql
Create table If Not Exists Activity (player_id int, device_id int, event_date date, games_played int)
Truncate table Activity
insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-01', '5')
insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-02', '6')
insert into Activity (player_id, device_id, event_date, games_played) values ('2', '3', '2017-06-25', '1')
insert into Activity (player_id, device_id, event_date, games_played) values ('3', '1', '2016-03-02', '0')
insert into Activity (player_id, device_id, event_date, games_played) values ('3', '4', '2018-07-03', '5');
代码实现:
sql
select player_id,
sum(games_played)over(partition by player_id order by event_date rows between unbounded preceding and current row) con
from activity;
514. 游戏玩法4
问题:报告在首次登录的第二天再次登录的玩家的 比率,四舍五入到小数点后两位。换句话说,你需要计算从首次登录日期开始至少连续两天登录的玩家的数量,然后除以玩家总数。
数据准备:
sql
Create table If Not Exists Activity (player_id int, device_id int, event_date date, games_played int)
Truncate table Activity
insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-01', '5')
insert into Activity (player_id, device_id, event_date, games_played) values ('1', '2', '2016-03-02', '6')
insert into Activity (player_id, device_id, event_date, games_played) values ('2', '3', '2017-06-25', '1')
insert into Activity (player_id, device_id, event_date, games_played) values ('3', '1', '2016-03-02', '0')
insert into Activity (player_id, device_id, event_date, games_played) values ('3', '4', '2018-07-03', '5');
思路:
|-----------|-----------|------------|--------------|----------|-----------|-----------|------------|--------------|-----------|
| | | | | | 筛选出各玩家第一次登录的记录 |||| |
| player_id | device_id | event_date | games_played | | player_id | device_id | event_date | games_played | 第二天 |
| 1 | 2 | 2016/3/1 | 5 | | 1 | 2 | 2016/3/1 | 5 | 2016/3/2 |
| 1 | 2 | 2016/3/2 | 6 | | 2 | 3 | 2017/6/25 | 1 | 2017/6/26 |
| 2 | 3 | 2017/6/25 | 1 | | 3 | 1 | 2016/3/2 | 0 | 2016/3/3 |
| 3 | 1 | 2016/3/2 | 0 | | | | | | |
| 3 | 4 | 2018/7/3 | 5 | | | | | | |
| | | | | | | | | | |
| 两张表内连接,player_id相等且第一次登录的第二天的时间等于原表中该用户的登录时间 ||||||||| |
| player_id | device_id | event_date | games_played | 第二天 | player_id | device_id | event_date | games_played | |
| 1 | 2 | 2016/3/1 | 5 | 2016/3/2 | 1 | 2 | 2016/3/2 | 6 | |
代码实现:
t1 筛选出每个玩家第一次登陆的记录
t2 在该记录的后面添加一列第二天列
t3 将表连接,筛选出玩家有第二天登录的记录
最后求出比值
sql
with t1 as (select player_id player_id1,min(event_date) event_date1 from activity group by player_id)
,t2 as (select *,date_add(event_date1,interval 1 day) date from t1)
,t3 as (select player_id1 from t2 join activity
on t2.player_id1=Activity.player_id and t2.date=Activity.event_date)
select round((select count(distinct player_id1) from t3)/(select count(distinct player_id) from activity),2) as fraction
;
569. 员工薪水中位数
问题:编写解决方案,找出每个公司的工资中位数。
数据准备:
sql
Create table If Not Exists Employee (id int, company varchar(255), salary int)
Truncate table Employee
insert into Employee (id, company, salary) values ('1', 'A', '2341')
insert into Employee (id, company, salary) values ('2', 'A', '341')
insert into Employee (id, company, salary) values ('3', 'A', '15')
insert into Employee (id, company, salary) values ('4', 'A', '15314')
insert into Employee (id, company, salary) values ('5', 'A', '451')
insert into Employee (id, company, salary) values ('6', 'A', '513')
insert into Employee (id, company, salary) values ('7', 'B', '15')
insert into Employee (id, company, salary) values ('8', 'B', '13')
insert into Employee (id, company, salary) values ('9', 'B', '1154')
insert into Employee (id, company, salary) values ('10', 'B', '1345')
insert into Employee (id, company, salary) values ('11', 'B', '1221')
insert into Employee (id, company, salary) values ('12', 'B', '234')
insert into Employee (id, company, salary) values ('13', 'C', '2345')
insert into Employee (id, company, salary) values ('14', 'C', '2645')
insert into Employee (id, company, salary) values ('15', 'C', '2645')
insert into Employee (id, company, salary) values ('16', 'C', '2652')
insert into Employee (id, company, salary) values ('17', 'C', '65');
代码实现:
t1 对所有员工的工资进行连续不并列排序
t2 找出有偶数个员工的公司的员工工资
t3 找出有奇数个员工的公司的员工工资
联合t2和t3
sql
with t1 as (select *,row_number() over (partition by company order by salary)ran from employee)
,t2 as (select company,round(max(ran)/2,0)rou1,round(max(ran)/2+1,0)rou2 from t1 group by company having max(ran)%2=0)
,t3 as (select company,round(max(ran)/2+0.5,0) rou from t1 group by company having max(ran)%2=1)
select t1.id,t1.company,t1.salary from t1 join t2 on t1.company=t2.company where ran in (rou1,rou2)
union all
select t1.id,t1.company,t1.salary from t1 join t3 on t1.company=t3.company where ran = rou;