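Interviewer: I see you have experience building attendance systems. Suppose we get this requirement: given the employees' punch records for a particular day, find each person's earliest and latest punch time. How would you design the SQL query?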
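Candidate: That is a classic problem. First I would pin down the data structure and the business scenario. Say we have an `attendance_records` table with fields such as `employee_id` and `check_time`. The core idea is to combine aggregate functions with a grouped query.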
Interviewer: Can you write out a basic query?
Candidate: (quickly types in Notepad)
```sql
SELECT
    employee_id,
    MIN(check_time) AS earliest,
    MAX(check_time) AS latest
FROM attendance_records
WHERE DATE(check_time) = '2023-08-01'
GROUP BY employee_id;
```
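This is only the basic version, though. In real development three issues tend to come up: 1) handling duplicate punches, 2) interference from data on adjacent days, and 3) performance.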
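The basic query can be tried end to end on a handful of rows. The sketch below is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in for MySQL, and the sample rows are invented. It includes a duplicate morning punch and a next-day record to show that `MIN`/`MAX` with the date filter already handles both.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendance_records (employee_id INTEGER, check_time TEXT)")
conn.executemany(
    "INSERT INTO attendance_records VALUES (?, ?)",
    [
        (1, "2023-08-01 08:00:00"),
        (1, "2023-08-01 08:05:00"),  # duplicate morning punch
        (1, "2023-08-01 18:10:00"),
        (2, "2023-08-01 09:15:00"),
        (2, "2023-08-01 17:45:00"),
        (2, "2023-08-02 08:30:00"),  # next day, filtered out by the WHERE clause
    ],
)

BASIC_QUERY = """
SELECT employee_id,
       MIN(check_time) AS earliest,
       MAX(check_time) AS latest
FROM attendance_records
WHERE DATE(check_time) = '2023-08-01'
GROUP BY employee_id
ORDER BY employee_id
"""

rows = conn.execute(BASIC_QUERY).fetchall()
for row in rows:
    print(row)
```

Each employee comes back as one row, with the 08:05 duplicate and the 2023-08-02 record having no effect on the result.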
Interviewer: Tell me more about the duplicate-punch issue.
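Candidate: For example, an employee might clock in at both 08:00 and 08:05. Taking the earliest with `MIN()` is still correct, but if you also need other columns, such as the location of the punch device, you need a subquery to pin down the specific rows. (pauses) For example, a more advanced query like this: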
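```sql
SELECT
    tmp.employee_id,
    a.check_time AS earliest,
    b.check_time AS latest
FROM (
    -- Per-employee earliest and latest punch for the day
    SELECT employee_id, MIN(check_time) AS min_time, MAX(check_time) AS max_time
    FROM attendance_records
    WHERE DATE(check_time) = '2023-08-01'
    GROUP BY employee_id
) AS tmp
-- a is the full row of the earliest punch, b the full row of the latest,
-- so any other column (device, location) can be selected from them
JOIN attendance_records a ON a.employee_id = tmp.employee_id AND a.check_time = tmp.min_time
JOIN attendance_records b ON b.employee_id = tmp.employee_id AND b.check_time = tmp.max_time;
```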
This is likely to be slow, though.
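To make the "locate the specific record" idea concrete, here is a minimal sketch that pulls the device location of both the earliest and the latest punch. It rests on assumptions: `device_location` is a hypothetical column name, the rows are invented, and `sqlite3` stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# device_location is a hypothetical column, added only for this illustration.
conn.execute(
    "CREATE TABLE attendance_records ("
    "employee_id INTEGER, check_time TEXT, device_location TEXT)"
)
conn.executemany(
    "INSERT INTO attendance_records VALUES (?, ?, ?)",
    [
        (1, "2023-08-01 08:00:00", "Gate A"),
        (1, "2023-08-01 08:05:00", "Lobby"),
        (1, "2023-08-01 18:10:00", "Gate B"),
        (2, "2023-08-01 09:15:00", "Lobby"),
    ],
)

QUERY = """
SELECT tmp.employee_id,
       a.check_time AS earliest, a.device_location AS earliest_device,
       b.check_time AS latest,   b.device_location AS latest_device
FROM (
    SELECT employee_id,
           MIN(check_time) AS min_time,
           MAX(check_time) AS max_time
    FROM attendance_records
    WHERE DATE(check_time) = '2023-08-01'
    GROUP BY employee_id
) AS tmp
JOIN attendance_records a
  ON a.employee_id = tmp.employee_id AND a.check_time = tmp.min_time
JOIN attendance_records b
  ON b.employee_id = tmp.employee_id AND b.check_time = tmp.max_time
ORDER BY tmp.employee_id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

One caveat with this pattern: if two rows share exactly the same earliest (or latest) timestamp for an employee, the join returns that employee more than once, so real data may need a tie-breaker such as the primary key.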
Interviewer: If the data grows to millions of rows, how would you optimize it?
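Candidate: I would work on three fronts: 1) create a composite index on `check_time` and `employee_id`; 2) avoid wrapping the column in `DATE()` and use a range predicate instead; 3) paginate and cache the results. For example, the optimized query: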
```sql
SELECT
    employee_id,
    MIN(check_time) AS earliest,
    MAX(check_time) AS latest
FROM attendance_records
WHERE check_time >= '2023-08-01 00:00:00'
  AND check_time < '2023-08-02 00:00:00'
GROUP BY employee_id;
```
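This lets the query use an index range scan on `check_time` instead of evaluating a function on every row. Seeing `Using index for group-by` in the execution plan would be the ideal outcome, though whether it appears depends on the index column order (it generally needs the grouping column `employee_id` first) and on the optimizer's choice.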
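Why the range predicate matters can be seen in a query plan. The sketch below is a rough analogy rather than a MySQL result: it uses `sqlite3` and SQLite's `EXPLAIN QUERY PLAN`, whose output differs from MySQL's `EXPLAIN`, and `idx_check_emp` is a hypothetical index name. The point it illustrates is the same in both engines: a function wrapped around the column hides it from the index, while a plain range can seek into it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendance_records (employee_id INTEGER, check_time TEXT)")
# Hypothetical index name; check_time leads so a date range can seek into it.
conn.execute(
    "CREATE INDEX idx_check_emp ON attendance_records (check_time, employee_id)"
)


def plan(where_clause):
    """Return the detail strings of SQLite's plan for the grouped query."""
    sql = (
        "EXPLAIN QUERY PLAN "
        "SELECT employee_id, MIN(check_time), MAX(check_time) "
        "FROM attendance_records WHERE " + where_clause + " GROUP BY employee_id"
    )
    # The human-readable description is the fourth column of each plan row.
    return [row[3] for row in conn.execute(sql)]


function_plan = plan("DATE(check_time) = '2023-08-01'")
range_plan = plan(
    "check_time >= '2023-08-01 00:00:00' AND check_time < '2023-08-02 00:00:00'"
)

print("DATE() filter:", function_plan)
print("range filter: ", range_plan)
```

In the range version the plan mentions a bounded search on `check_time`; in the `DATE()` version it does not, because every row has to be visited and converted before it can be compared.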
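Interviewer: What if you hit an `ONLY_FULL_GROUP_BY` error? For example, when you also need to show the employee's name.
Candidate: That is a classic trap. (PS: this guy is out to get me.)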
```sql
-- Incorrect example
SELECT
    employee_id,
    employee_name, -- this column raises the error
    MIN(check_time),
    MAX(check_time)
FROM attendance_records
GROUP BY employee_id;
```
There are three ways to fix it: 1) add `employee_name` to the `GROUP BY`; 2) wrap it in `ANY_VALUE(employee_name)`; 3) use a subquery and join the employee table. For example:
```sql
SELECT
    a.employee_id,
    e.employee_name,
    a.earliest,
    a.latest
FROM (
    SELECT
        employee_id,
        MIN(check_time) AS earliest,
        MAX(check_time) AS latest
    FROM attendance_records
    -- Half-open range: BETWEEN would also match 2023-08-02 00:00:00
    WHERE check_time >= '2023-08-01' AND check_time < '2023-08-02'
    GROUP BY employee_id
) a
JOIN employees e ON a.employee_id = e.id;
```
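This form satisfies `ONLY_FULL_GROUP_BY`, and the range predicate on `check_time` can use an index instead of scanning the whole table.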
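One detail in this kind of day filter is the upper bound: an inclusive `BETWEEN` that ends at midnight also matches a punch at exactly 00:00:00 on the next day, whereas a half-open range does not. The sketch below runs the subquery-plus-join pattern both ways. It is an illustration under assumptions: `sqlite3` stands in for MySQL, the employees and rows are invented, and the bounds are written as full timestamps because SQLite compares these values as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, employee_name TEXT);
CREATE TABLE attendance_records (employee_id INTEGER, check_time TEXT);
INSERT INTO employees VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO attendance_records VALUES
    (1, '2023-08-01 08:00:00'),
    (1, '2023-08-01 18:10:00'),
    (2, '2023-08-01 09:15:00'),
    (2, '2023-08-02 00:00:00');
""")

# Aggregate first, then join the employee table for the name.
TEMPLATE = """
SELECT a.employee_id, e.employee_name, a.earliest, a.latest
FROM (
    SELECT employee_id,
           MIN(check_time) AS earliest,
           MAX(check_time) AS latest
    FROM attendance_records
    WHERE {day_filter}
    GROUP BY employee_id
) a
JOIN employees e ON a.employee_id = e.id
ORDER BY a.employee_id
"""

HALF_OPEN = "check_time >= '2023-08-01 00:00:00' AND check_time < '2023-08-02 00:00:00'"
INCLUSIVE = "check_time BETWEEN '2023-08-01 00:00:00' AND '2023-08-02 00:00:00'"

half_open_rows = conn.execute(TEMPLATE.format(day_filter=HALF_OPEN)).fetchall()
inclusive_rows = conn.execute(TEMPLATE.format(day_filter=INCLUSIVE)).fetchall()

print("half-open:", half_open_rows)
print("inclusive:", inclusive_rows)
```

With the inclusive bound, Bob's latest punch becomes the midnight record that belongs to the next day; with the half-open bound it stays at 09:15.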
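Interviewer: If an employee has only one punch that day, how should the system treat it?
Candidate: That is a business-logic question. Under the earlier requirements we agreed that a single punch counts as both clock-in and clock-out, but the record gets flagged. In SQL it can be expressed like this: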
```sql
SELECT
    employee_id,
    MIN(check_time) AS earliest,
    MAX(check_time) AS latest,
    CASE WHEN COUNT(*) = 1 THEN 1 ELSE 0 END AS is_single_check
FROM attendance_records
...
```
The front end uses the `is_single_check` field to show a special marker, and it also triggers an attendance-anomaly alert.
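A complete, runnable version of the single-punch flag might look like the sketch below. The `WHERE` and `GROUP BY` clauses here are assumptions modelled on the earlier queries, the rows are invented, and `sqlite3` stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendance_records (employee_id INTEGER, check_time TEXT)")
conn.executemany(
    "INSERT INTO attendance_records VALUES (?, ?)",
    [
        (1, "2023-08-01 08:00:00"),
        (1, "2023-08-01 18:10:00"),
        (2, "2023-08-01 09:15:00"),  # employee 2 punched only once
    ],
)

QUERY = """
SELECT employee_id,
       MIN(check_time) AS earliest,
       MAX(check_time) AS latest,
       CASE WHEN COUNT(*) = 1 THEN 1 ELSE 0 END AS is_single_check
FROM attendance_records
WHERE check_time >= '2023-08-01 00:00:00'
  AND check_time <  '2023-08-02 00:00:00'
GROUP BY employee_id
ORDER BY employee_id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

For the flagged employee, `earliest` and `latest` hold the same timestamp, which matches the rule that a single punch counts as both clock-in and clock-out.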
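Interviewer: What if we need to compute, in real time, the top 3 employees who arrived earliest across the whole company?
Candidate: That calls for a window function. For example: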
```sql
WITH ranked_data AS (
    SELECT
        employee_id,
        MIN(check_time) AS earliest,
        -- Window functions run after GROUP BY, so they can order by the aggregate.
        -- The alias is rn because RANK is a reserved word in MySQL 8.0.
        ROW_NUMBER() OVER (ORDER BY MIN(check_time)) AS rn
    FROM attendance_records
    WHERE DATE(check_time) = '2023-08-01'
    GROUP BY employee_id
)
SELECT * FROM ranked_data WHERE rn <= 3;
```
Note, though, that when several people punch in at the same time you may need `DENSE_RANK` so that tied employees share a rank.
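The difference between the two ranking functions shows up as soon as there is a tie. The following sketch computes both on invented data in which two employees arrive at 07:30. It assumes `sqlite3` as a stand-in for MySQL (SQLite 3.25 or later is needed for window functions), and it aggregates in a CTE first so the window functions only see one row per employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendance_records (employee_id INTEGER, check_time TEXT)")
conn.executemany(
    "INSERT INTO attendance_records VALUES (?, ?)",
    [
        (1, "2023-08-01 07:30:00"),
        (2, "2023-08-01 07:30:00"),  # ties with employee 1
        (3, "2023-08-01 07:45:00"),
        (4, "2023-08-01 08:00:00"),
        (5, "2023-08-01 08:10:00"),
        (1, "2023-08-01 18:00:00"),  # evening punch, ignored by MIN
    ],
)

QUERY = """
WITH first_checks AS (
    SELECT employee_id, MIN(check_time) AS earliest
    FROM attendance_records
    WHERE check_time >= '2023-08-01 00:00:00'
      AND check_time <  '2023-08-02 00:00:00'
    GROUP BY employee_id
)
SELECT employee_id,
       earliest,
       ROW_NUMBER() OVER (ORDER BY earliest, employee_id) AS rn,
       DENSE_RANK() OVER (ORDER BY earliest) AS dr
FROM first_checks
ORDER BY rn
"""

ranked = conn.execute(QUERY).fetchall()
top3_by_row_number = [emp for emp, _, rn, _ in ranked if rn <= 3]
top3_by_dense_rank = [emp for emp, _, _, dr in ranked if dr <= 3]

print(top3_by_row_number)  # exactly three employees, ties broken by employee_id
print(top3_by_dense_rank)  # three distinct arrival times, so possibly more people
```

`ROW_NUMBER` always yields exactly three rows but has to break the tie arbitrarily (here by `employee_id`), while `DENSE_RANK` keeps everyone who shares one of the three earliest times. Which behaviour is right is a product decision rather than a SQL one.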
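Interviewer: Suppose development says the feature works, but QA reports that some people are missing from the results. How would you troubleshoot?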
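Candidate: I would go through four steps: 1) check the time zone settings, since I have seen UTC-to-local conversion shift records onto the wrong date; 2) check whether `employee_id` values contain spaces or special characters; 3) use `EXPLAIN` to confirm the query hits the intended index; 4) run a test case by hand, for example: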
```sql
-- Insert test data
INSERT INTO attendance_records
    (employee_id, check_time) VALUES
    (999, '2023-08-01 07:30:00'),
    (999, '2023-08-01 18:15:00');
-- Verify the query
SELECT * FROM (
    SELECT employee_id, MIN(check_time)...
) AS t -- MySQL requires an alias on a derived table
WHERE employee_id = 999;
```
White-box testing like this quickly shows whether the problem lies in the data or in the logic.
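The time zone pitfall from step 1 is easy to reproduce outside the database. The sketch below assumes, purely for illustration, that punches are stored in UTC and that the office runs on a fixed UTC+8 offset. A 07:30 local clock-in then lands on the previous calendar day in UTC, so a date filter applied to the stored value would drop the employee from that day's report.

```python
from datetime import datetime, timedelta, timezone

# Assumption: timestamps are stored in UTC and the office is at a fixed UTC+8.
LOCAL_TZ = timezone(timedelta(hours=8))


def local_attendance_date(utc_timestamp: str) -> str:
    """Return the local calendar date for a 'YYYY-MM-DD HH:MM:SS' UTC string."""
    utc_dt = datetime.strptime(utc_timestamp, "%Y-%m-%d %H:%M:%S").replace(
        tzinfo=timezone.utc
    )
    return utc_dt.astimezone(LOCAL_TZ).date().isoformat()


stored = "2023-07-31 23:30:00"  # 07:30 local time on 2023-08-01

naive_date = stored[:10]                    # what a date filter on the raw value sees
local_date = local_attendance_date(stored)  # the day the employee actually clocked in

print(naive_date)  # 2023-07-31
print(local_date)  # 2023-08-01
```

If the two dates differ for the missing employees, the bug is in where the conversion happens rather than in the aggregation itself.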
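Interviewer: Very solid technical thinking, especially the way you anticipate edge cases. Can you come in tomorrow for a final round with the CTO?
Candidate: Of course. I look forward to discussing optimizations at the system-architecture level.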
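Mock Data and Test Script
Below is a complete MySQL script that creates the attendance table and inserts random punch data for 10 employees over May 2023 (31 days). Records are generated for weekdays only; weekends are skipped.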

```sql
-- Create the attendance table (if it does not exist)
CREATE TABLE IF NOT EXISTS attendance_records (
    id INT AUTO_INCREMENT PRIMARY KEY,
    employee_id INT NOT NULL COMMENT 'Employee ID',
    employee_name VARCHAR(50) NOT NULL COMMENT 'Employee name',
    check_time DATETIME NOT NULL COMMENT 'Punch time',
    check_type VARCHAR(10) COMMENT 'Punch type (IN = clock in, OUT = clock out)',
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP COMMENT 'Record creation time'
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Employee attendance punch records';
-- Clear existing data (if the table already existed)
TRUNCATE TABLE attendance_records;
```
```sql
-- Insert one month of punch data (May 2023) for 10 employees
-- Drop any leftover copy first so the script can be re-run
DROP PROCEDURE IF EXISTS GenerateAttendanceData;
DELIMITER //
CREATE PROCEDURE GenerateAttendanceData()
BEGIN
    DECLARE i INT DEFAULT 0;
    DECLARE j INT DEFAULT 0;
    DECLARE emp_id INT;
    DECLARE emp_name VARCHAR(50);
    DECLARE check_date DATE;
    DECLARE weekday INT;
    DECLARE check_in_time DATETIME;
    DECLARE check_out_time DATETIME;
    DECLARE lunch_start DATETIME;
    DECLARE lunch_end DATETIME;
    -- Data for the 10 employees (IDs and names as parallel comma-separated lists)
    DECLARE emp_ids VARCHAR(100) DEFAULT '101,102,103,104,105,106,107,108,109,110';
    DECLARE emp_names VARCHAR(1000) DEFAULT '张三,李四,王五,赵六,钱七,孙八,周九,吴十,郑十一,王十二';
    -- Generate data for each employee
    WHILE i < 10 DO
        -- Pick the (i+1)-th element from each list
        SET emp_id = SUBSTRING_INDEX(SUBSTRING_INDEX(emp_ids, ',', i+1), ',', -1);
        SET emp_name = SUBSTRING_INDEX(SUBSTRING_INDEX(emp_names, ',', i+1), ',', -1);
        -- Generate data for every day in May (1st to 31st)
        SET j = 1;
        WHILE j <= 31 DO
            SET check_date = DATE(CONCAT('2023-05-', LPAD(j, 2, '0')));
            SET weekday = DAYOFWEEK(check_date); -- 1 = Sunday, 2 = Monday, ..., 7 = Saturday
            -- Only generate data for workdays (Monday to Friday)
            IF weekday BETWEEN 2 AND 6 THEN
                -- Clock-in time (random offset of 0-89 minutes from 08:00)
                SET check_in_time = TIMESTAMPADD(
                    MINUTE,
                    FLOOR(RAND() * 90),
                    TIMESTAMP(check_date, '08:00:00')
                );
                -- Lunch start (random offset of 0-29 minutes from 12:00)
                SET lunch_start = TIMESTAMPADD(
                    MINUTE,
                    FLOOR(RAND() * 30),
                    TIMESTAMP(check_date, '12:00:00')
                );
                -- Lunch end (random offset of 0-59 minutes from 13:00)
                SET lunch_end = TIMESTAMPADD(
                    MINUTE,
                    FLOOR(RAND() * 60),
                    TIMESTAMP(check_date, '13:00:00')
                );
                -- Clock-out time (random offset of 0-119 minutes from 17:30)
                SET check_out_time = TIMESTAMPADD(
                    MINUTE,
                    FLOOR(RAND() * 120),
                    TIMESTAMP(check_date, '17:30:00')
                );
                -- Insert the clock-in record (there may be more than one)
                INSERT INTO attendance_records (employee_id, employee_name, check_time, check_type)
                VALUES (emp_id, emp_name, check_in_time, 'IN');
                -- 30% chance of a second clock-in (unsure whether the first one registered)
                IF RAND() < 0.3 THEN
                    INSERT INTO attendance_records (employee_id, employee_name, check_time, check_type)
                    VALUES (emp_id, emp_name, TIMESTAMPADD(MINUTE, FLOOR(RAND() * 10), check_in_time), 'IN');
                END IF;
                -- Insert the lunch punches
                INSERT INTO attendance_records (employee_id, employee_name, check_time, check_type)
                VALUES (emp_id, emp_name, lunch_start, 'OUT');
                INSERT INTO attendance_records (employee_id, employee_name, check_time, check_type)
                VALUES (emp_id, emp_name, lunch_end, 'IN');
                -- Insert the clock-out record
                INSERT INTO attendance_records (employee_id, employee_name, check_time, check_type)
                VALUES (emp_id, emp_name, check_out_time, 'OUT');
                -- 20% chance of a second clock-out
                IF RAND() < 0.2 THEN
                    INSERT INTO attendance_records (employee_id, employee_name, check_time, check_type)
                    VALUES (emp_id, emp_name, TIMESTAMPADD(MINUTE, FLOOR(RAND() * 15), check_out_time), 'OUT');
                END IF;
                -- 10% chance of a forgotten lunch punch: remove one of the two lunch records
                IF RAND() < 0.1 THEN
                    DELETE FROM attendance_records
                    WHERE employee_id = emp_id
                      AND DATE(check_time) = check_date
                      AND check_type IN ('OUT', 'IN')
                      AND check_time BETWEEN lunch_start AND lunch_end
                    LIMIT 1;
                END IF;
                -- 5% chance of only one punch all day: keep the earliest record, delete the rest
                IF RAND() < 0.05 THEN
                    DELETE FROM attendance_records
                    WHERE employee_id = emp_id
                      AND DATE(check_time) = check_date
                      AND id NOT IN (
                          -- The extra derived table lets MySQL read the table being deleted from
                          SELECT id FROM (
                              SELECT id FROM attendance_records
                              WHERE employee_id = emp_id
                                AND DATE(check_time) = check_date
                              ORDER BY check_time
                              LIMIT 1
                          ) AS t
                      );
                END IF;
            END IF;
            SET j = j + 1;
        END WHILE;
        SET i = i + 1;
    END WHILE;
END //
DELIMITER ;
-- Run the stored procedure to generate the data
CALL GenerateAttendanceData();
-- Drop the stored procedure
DROP PROCEDURE IF EXISTS GenerateAttendanceData;
```
```sql
-- Count the generated records
SELECT COUNT(*) AS total_records FROM attendance_records;
-- Example: one employee's punches on one day
SELECT * FROM attendance_records
WHERE employee_id = 101
  AND DATE(check_time) = '2023-05-15'
ORDER BY check_time;
-- Daily attendance per person (earliest and latest punch)
SELECT
    ar.employee_id,
    ar.employee_name,
    DATE(ar.check_time) AS attendance_date,
    DAYNAME(DATE(ar.check_time)) AS day_of_week,
    MIN(ar.check_time) AS first_check_in,
    MAX(ar.check_time) AS last_check_out,
    TIMESTAMPDIFF(MINUTE, MIN(ar.check_time), MAX(ar.check_time)) AS total_minutes,
    COUNT(*) AS total_checks
FROM
    attendance_records ar
WHERE
    -- Half-open range keeps the predicate index-friendly
    ar.check_time >= '2023-05-01' AND ar.check_time < '2023-06-01'
GROUP BY
    ar.employee_id,
    ar.employee_name,
    DATE(ar.check_time),          -- one group per calendar day
    DAYNAME(DATE(ar.check_time))  -- listed so the query passes ONLY_FULL_GROUP_BY
ORDER BY
    ar.employee_id, attendance_date;
```
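Because the generator is random, `total_records` differs from run to run, but it should always fall within a predictable range. The sketch below derives that range from the script's own rules: weekdays only, at most six punches per employee per day (four regular ones plus two optional repeats), and at least one (the "single punch" branch keeps the earliest record). The helper name `weekdays_in_month` is introduced here for illustration.

```python
import calendar
from datetime import date

EMPLOYEES = 10
MIN_PER_DAY = 1  # the "only one punch all day" branch keeps a single record
MAX_PER_DAY = 6  # 4 regular punches + second clock-in + second clock-out


def weekdays_in_month(year: int, month: int) -> int:
    """Count Monday-to-Friday days in the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return sum(
        1
        for day in range(1, days_in_month + 1)
        if date(year, month, day).weekday() < 5  # Monday = 0 ... Friday = 4
    )


workdays = weekdays_in_month(2023, 5)
lower_bound = EMPLOYEES * workdays * MIN_PER_DAY
upper_bound = EMPLOYEES * workdays * MAX_PER_DAY

print(workdays)                  # 23
print(lower_bound, upper_bound)  # 230 1380
```

A `total_records` value outside 230 to 1380 would suggest the procedure ran more than once without the `TRUNCATE`, or that the weekday filter is not doing what the comment says.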