SQL面试题练习 —— 合并用户浏览行为

目录

  • [1 题目](#1 题目)
  • [2 建表语句](#2 建表语句)
  • [3 题解](#3 题解)

1 题目

有一份用户访问记录表,记录用户id和访问时间,如果用户访问时间间隔小于60s则认为时一次浏览,请合并用户的浏览行为。

样例数据

复制代码
+----------+--------------+
| user_id  | access_time  |
+----------+--------------+
| 1        | 1736337600   |
| 1        | 1736337660   |
| 2        | 1736337670   |
| 1        | 1736337710   |
| 3        | 1736337715   |
| 2        | 1736337750   |
| 1        | 1736337760   |
| 3        | 1736337820   |
| 2        | 1736337850   |
| 1        | 1736337910   |
+----------+--------------+

2 建表语句

sql 复制代码
--建表语句
CREATE TABLE user_access_log (
  user_id INT,
  access_time BIGINT
) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
--插入数据
insert into user_access_log (user_id,access_time)
values
(1,1736337600),
(1,1736337660),
(2,1736337670),
(1,1736337710),
(3,1736337715),
(2,1736337750),
(1,1736337760),
(3,1736337820),
(2,1736337850),
(1,1736337910);

3 题解

(1)分用户计算出每次点击时间差;

sql 复制代码
select user_id,
       access_time,
       last_access_time,
       access_time - last_access_time as time_diff
from (select user_id,
             access_time,
             lag(access_time) over (partition by user_id order by access_time) as last_access_time
      from user_access_log) t

执行结果

复制代码
+----------+--------------+-------------------+------------+
| user_id  | access_time  | last_access_time  | time_diff  |
+----------+--------------+-------------------+------------+
| 1        | 1736337600   | NULL              | NULL       |
| 1        | 1736337660   | 1736337600        | 60         |
| 1        | 1736337710   | 1736337660        | 50         |
| 1        | 1736337760   | 1736337710        | 50         |
| 1        | 1736337910   | 1736337760        | 150        |
| 2        | 1736337670   | NULL              | NULL       |
| 2        | 1736337750   | 1736337670        | 80         |
| 2        | 1736337850   | 1736337750        | 100        |
| 3        | 1736337715   | NULL              | NULL       |
| 3        | 1736337820   | 1736337715        | 105        |
+----------+--------------+-------------------+------------+

(2)确认是否是新的访问

sql 复制代码
select user_id,
       access_time,
       last_access_time,
       if(access_time - last_access_time >= 60, 1, 0) as is_new_group
from (select user_id,
             access_time,
             lag(access_time) over (partition by user_id order by access_time) as last_access_time
      from user_access_log) t

执行结果

复制代码
+----------+--------------+-------------------+---------------+
| user_id  | access_time  | last_access_time  | is_new_group  |
+----------+--------------+-------------------+---------------+
| 1        | 1736337600   | NULL              | 0             |
| 1        | 1736337660   | 1736337600        | 1             |
| 1        | 1736337710   | 1736337660        | 0             |
| 1        | 1736337760   | 1736337710        | 0             |
| 1        | 1736337910   | 1736337760        | 1             |
| 2        | 1736337670   | NULL              | 0             |
| 2        | 1736337750   | 1736337670        | 1             |
| 2        | 1736337850   | 1736337750        | 1             |
| 3        | 1736337715   | NULL              | 0             |
| 3        | 1736337820   | 1736337715        | 1             |
+----------+--------------+-------------------+---------------+

(3)得出结果

使用sum()over(partition by ...... order by ......)累加计算,给出组ID。聚合函数开窗使用order by 计算结果是从分组开始计算到当前行的结果。

这里的技巧:需要新建组的时候就给标签赋值1,否则0,然后累加计算结果在新建组的时候值就会变化,根据聚合值分组,得到合并结果。

sql 复制代码
with t_group as
         (select user_id,
                 access_time,
                 last_access_time,
                 if(access_time - last_access_time >= 60, 1, 0) as is_new_group
          from (select user_id,
                       access_time,
                       lag(access_time) over (partition by user_id order by access_time) as last_access_time
                from user_access_log) t)
select user_id,
       access_time,
       last_access_time,
       is_new_group,
       sum(is_new_group) over (partition by user_id order by access_time asc) as group_id
from t_group

执行结果

复制代码
+----------+--------------+-------------------+---------------+-----------+
| user_id  | access_time  | last_access_time  | is_new_group  | group_id  |
+----------+--------------+-------------------+---------------+-----------+
| 1        | 1736337600   | NULL              | 0             | 0         |
| 1        | 1736337660   | 1736337600        | 1             | 1         |
| 1        | 1736337710   | 1736337660        | 0             | 1         |
| 1        | 1736337760   | 1736337710        | 0             | 1         |
| 1        | 1736337910   | 1736337760        | 1             | 2         |
| 2        | 1736337670   | NULL              | 0             | 0         |
| 2        | 1736337750   | 1736337670        | 1             | 1         |
| 2        | 1736337850   | 1736337750        | 1             | 2         |
| 3        | 1736337715   | NULL              | 0             | 0         |
| 3        | 1736337820   | 1736337715        | 1             | 1         |
+----------+--------------+-------------------+---------------+-----------+
相关推荐
码农小卡拉2 小时前
深入解析Spring Boot文件加载顺序与加载方式
java·数据库·spring boot
怣502 小时前
MySQL多表连接:全外连接、交叉连接与结果集合并详解
数据库·sql
wjhx2 小时前
QT中对蓝牙权限的申请,整理一下
java·数据库·qt
冰暮流星2 小时前
javascript之二重循环练习
开发语言·javascript·数据库
万岳科技系统开发3 小时前
食堂采购系统源码库存扣减算法与并发控制实现详解
java·前端·数据库·算法
冉冰学姐3 小时前
SSM智慧社区管理系统jby69(程序+源码+数据库+调试部署+开发环境)带论文文档1万字以上,文末可获取,系统界面在最后面
数据库·管理系统·智慧社区·ssm 框架
杨超越luckly3 小时前
HTML应用指南:利用GET请求获取中国500强企业名单,揭秘企业增长、分化与转型的新常态
前端·数据库·html·可视化·中国500强
Elastic 中国社区官方博客3 小时前
Elasticsearch:Workflows 介绍 - 9.3
大数据·数据库·人工智能·elasticsearch·ai·全文检索
仍然.3 小时前
MYSQL--- 聚合查询,分组查询和联合查询
数据库
一 乐3 小时前
校园二手交易|基于springboot + vue校园二手交易系统(源码+数据库+文档)
java·数据库·vue.js·spring boot·后端