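1. Window Functions
Window functions perform calculations over a "window" (a specified range of rows) of the result set. Unlike GROUP BY, they keep every original row and add an aggregate or ranking value to it.
1.1 Core Syntax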
SELECT
    column1,
    column2,
    window_function() OVER (
        [PARTITION BY partition_column]
        [ORDER BY sort_column]
        [ROWS | RANGE frame_definition]
    ) AS alias
FROM table_name;
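1.2 Common Window Functions
- Ranking functions:
ROW_NUMBER()   -- row number (unique, no ties)
RANK()         -- rank (ties share a rank; the ranks that follow are skipped)
DENSE_RANK()   -- dense rank (ties share a rank; no gaps afterwards)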
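Example: rank students by score
-- RANK and DENSE_RANK are reserved words in MySQL 8.0, so the aliases use different names
SELECT
    name,
    score,
    ROW_NUMBER() OVER (ORDER BY score DESC) AS row_num,
    RANK()       OVER (ORDER BY score DESC) AS score_rank,
    DENSE_RANK() OVER (ORDER BY score DESC) AS score_dense_rank
FROM students;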
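The difference between the three ranking functions only shows up when scores tie. The following is a minimal sketch that runs the same kind of query through Python's built-in sqlite3 module, with an in-memory SQLite database standing in for MySQL (an assumption made so the example runs anywhere). The four sample rows are invented, and the extra `name` sort key in ROW_NUMBER() is added only to make the output deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for the tutorial's `students` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("Ann", 95), ("Bob", 90), ("Cid", 90), ("Dee", 85)],  # Bob and Cid tie
)

rows = conn.execute(
    """
    SELECT name,
           score,
           -- `name` breaks the tie so row numbers are reproducible
           ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,
           RANK()       OVER (ORDER BY score DESC)       AS score_rank,
           DENSE_RANK() OVER (ORDER BY score DESC)       AS score_dense_rank
    FROM students
    ORDER BY score DESC, name
    """
).fetchall()

for row in rows:
    print(row)
```

Bob and Cid share rank 2 under both RANK() and DENSE_RANK(); Dee then gets rank 4 from RANK() but rank 3 from DENSE_RANK(), while ROW_NUMBER() simply counts 1 to 4.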
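- Aggregate functions:
SUM(column) OVER (PARTITION BY partition_column)   -- sum within each partition
AVG(column) OVER (ORDER BY sort_column ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)   -- moving average over the current row and the 2 rows before it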
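Example: running total of scores in enrollment order
-- With ORDER BY and no explicit frame, rows that share the same enroll_date receive the same running total
SELECT
    name,
    score,
    SUM(score) OVER (ORDER BY enroll_date) AS cumulative_sum
FROM students;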
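- Offset and distribution functions:
LAG(column, n)    -- value from the row n rows before the current row
LEAD(column, n)   -- value from the row n rows after the current row
NTILE(4)          -- splits the ordered rows into 4 buckets of near-equal size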
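To see the offset, bucket, and frame clauses side by side, here is a small sketch that again uses Python's sqlite3 module as a stand-in for MySQL. The `daily_sales` table and its five rows are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one sales amount per day.
conn.execute("CREATE TABLE daily_sales (day INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)],
)

rows = conn.execute(
    """
    SELECT day,
           amount,
           LAG(amount, 1)  OVER (ORDER BY day) AS prev_amount,   -- NULL on the first row
           LEAD(amount, 1) OVER (ORDER BY day) AS next_amount,   -- NULL on the last row
           NTILE(2)        OVER (ORDER BY day) AS bucket,        -- 5 rows split into 3 + 2
           AVG(amount)     OVER (
               ORDER BY day
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS moving_avg                                        -- up to 3 rows per frame
    FROM daily_sales
    ORDER BY day
    """
).fetchall()

for row in rows:
    print(row)
```

The first two rows have fewer than three rows in their frame, so the moving average there is taken over whatever rows exist.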
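2. Stored Procedures and Functions
2.1 Stored Procedures
A stored procedure encapsulates complex logic so that it can be called repeatedly:
-- Create a stored procedure (example: select students within an age range)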
DELIMITER //
CREATE PROCEDURE GetStudentsByAge(IN min_age INT, IN max_age INT)
BEGIN
SELECT name, age
FROM students
WHERE age BETWEEN min_age AND max_age;
END //
DELIMITER ;
-- Call the stored procedure
CALL GetStudentsByAge(18, 25);
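2.2 User-Defined Functions
A user-defined function returns a single value and can be used inside queries:
-- Create a function (example: compute a discounted price)
-- In the mysql client, switch the delimiter so the semicolon inside BEGIN ... END does not end the statement early
DELIMITER //
CREATE FUNCTION CalculateDiscount(price DECIMAL(10,2), discount_rate DECIMAL(3,2))
RETURNS DECIMAL(10,2)
DETERMINISTIC
BEGIN
    RETURN price * (1 - discount_rate);
END //
DELIMITER ;
-- Use the function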
SELECT product_name, price, CalculateDiscount(price, 0.1) AS discounted_price
FROM products;
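3. Triggers
A trigger runs automatically before or after a specified event (INSERT, UPDATE, or DELETE):
-- Create a trigger (example: reduce stock when an order is inserted)
DELIMITER //
CREATE TRIGGER UpdateInventoryAfterOrder
AFTER INSERT ON orders
FOR EACH ROW
BEGIN
    UPDATE products
    SET stock = stock - NEW.quantity
    WHERE product_id = NEW.product_id;
END //
DELIMITER ;
-- After an order is inserted, the stock is reduced automatically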
INSERT INTO orders (product_id, quantity) VALUES (101, 5);
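The trigger's effect can be checked end to end with a short script. This sketch assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL; SQLite accepts the same trigger body without a DELIMITER change. The table definitions and the starting stock of 20 are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Assumed minimal schemas for the two tables used by the trigger.
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,
        stock      INTEGER NOT NULL
    );
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL,
        quantity   INTEGER NOT NULL
    );

    CREATE TRIGGER UpdateInventoryAfterOrder
    AFTER INSERT ON orders
    FOR EACH ROW
    BEGIN
        UPDATE products
        SET stock = stock - NEW.quantity
        WHERE product_id = NEW.product_id;
    END;

    INSERT INTO products (product_id, stock) VALUES (101, 20);
    """
)

# Inserting an order fires the trigger, which lowers the stock by 5.
conn.execute("INSERT INTO orders (product_id, quantity) VALUES (101, 5)")
stock = conn.execute(
    "SELECT stock FROM products WHERE product_id = 101"
).fetchone()[0]
print(stock)
```

Note that this trigger does not stop the stock from going negative; a production schema would likely add a CHECK constraint or validate the quantity first.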
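4. Dynamic SQL
Dynamic SQL builds a query at run time (example: add a filter only when a value is supplied):
-- Prepared statement inside a stored procedure (MySQL example; IF ... THEN is valid only inside stored programs)
-- FindStudents is an illustrative name
DELIMITER //
CREATE PROCEDURE FindStudents(IN age_filter INT)
BEGIN
    SET @sql = 'SELECT * FROM students WHERE 1=1';
    -- Add the condition dynamically; bind the value through a placeholder instead of concatenating it
    IF age_filter IS NOT NULL THEN
        SET @sql = CONCAT(@sql, ' AND age = ?');
        SET @age = age_filter;
        PREPARE stmt FROM @sql;
        EXECUTE stmt USING @age;
    ELSE
        PREPARE stmt FROM @sql;
        EXECUTE stmt;
    END IF;
    DEALLOCATE PREPARE stmt;
END //
DELIMITER ;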
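The same idea often lives in application code rather than in the database. Below is a minimal sketch in Python with sqlite3 standing in for MySQL: the SQL text grows one condition per supplied filter, and every value travels as a bound parameter so user input never becomes part of the SQL string. The helper name `build_student_query`, the `score` filter, and the sample rows are hypothetical.

```python
import sqlite3


def build_student_query(age=None, min_score=None):
    """Return (sql, params) with one placeholder per supplied filter."""
    sql = "SELECT name, age, score FROM students WHERE 1=1"
    params = []
    if age is not None:
        sql += " AND age = ?"
        params.append(age)
    if min_score is not None:
        sql += " AND score >= ?"
        params.append(min_score)
    return sql + " ORDER BY name", params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, age INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("Ann", 20, 95), ("Bob", 20, 70), ("Cid", 22, 88)],
)

# Both filters supplied: two placeholders, two bound values.
sql, params = build_student_query(age=20, min_score=80)
filtered = conn.execute(sql, params).fetchall()

# No filters supplied: the base query runs unchanged.
sql_all, params_all = build_student_query()
everyone = conn.execute(sql_all, params_all).fetchall()

print(filtered)
print(everyone)
```

Only the fixed SQL fragments are concatenated; the values themselves are passed separately, which is what protects against SQL injection.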
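5. JSON Data Handling
Many modern databases support a JSON type and JSON operations (MySQL syntax is shown here):
-- Create a table with a JSON column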
CREATE TABLE user_profiles (
user_id INT PRIMARY KEY,
profile JSON
);
-- Insert JSON data
INSERT INTO user_profiles VALUES (1, '{"name": "Alice", "hobbies": ["reading", "music"]}');
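-- Query JSON fields
-- ->> returns the unquoted string; -> and JSON_EXTRACT return the JSON value, so strings keep their double quotes
SELECT
    user_id,
    profile->>'$.name' AS name,
    JSON_EXTRACT(profile, '$.hobbies[0]') AS first_hobby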
FROM user_profiles;
-- Update a JSON field
UPDATE user_profiles
SET profile = JSON_SET(profile, '$.age', 25)
WHERE user_id = 1;
6. Performance Optimization Tips
6.1 Analyzing the Execution Plan
Use EXPLAIN to see how the optimizer plans to execute a query:
EXPLAIN SELECT * FROM students WHERE age > 20;
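6.2 Index Optimization
- Covering index: the index contains every column the query needs, so the table rows do not have to be read.
- Avoid full table scans: index the columns used in WHERE and JOIN conditions.
- Composite index order: a composite index is used from its leftmost column onward, so put the columns tested with equality first, preferring the more selective ones.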
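The effect of a covering index can be observed directly in a query plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as a stand-in; MySQL's EXPLAIN output is formatted differently (there, a covering index typically shows up as "Using index" in the Extra column). The index name `idx_students_age_name` and the helper `plan` are hypothetical.

```python
import sqlite3


def plan(conn, sql):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The fourth column of each row holds the human-readable plan step.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students ("
    "id INTEGER PRIMARY KEY, name TEXT, age INTEGER, score INTEGER)"
)
query = "SELECT name, age FROM students WHERE age > 20"

# Without an index on age, the only option is to scan the whole table.
plan_before = plan(conn, query)

# The index holds both age (filter) and name (output), so it covers the query.
conn.execute("CREATE INDEX idx_students_age_name ON students (age, name)")
plan_after = plan(conn, query)

print(plan_before)
print(plan_after)
```

The exact wording of the plan varies between SQLite versions, so the useful signal is the switch from a table scan to a search that uses the covering index.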
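6.3 Avoiding Implicit Type Conversion
-- Avoid (a numeric column compared with a string literal, which relies on implicit conversion)
-- The reverse case costs more: comparing a string column with a numeric literal converts the column value in every row, which usually prevents index use
SELECT * FROM students WHERE id = '100';
-- Preferred (the literal matches the column type)
SELECT * FROM students WHERE id = 100;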
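6.4 Pagination Optimization
LIMIT offset, size becomes slow at large offsets because the server must read and discard every skipped row. Seek from the last row already returned instead:
-- Optimized form (keyset pagination on an ordered, unique column)
SELECT * FROM students
WHERE id > 1000   -- largest id returned by the previous page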
ORDER BY id
LIMIT 10;
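In application code, keyset pagination means remembering the last id of each page and passing it to the next query. The following is a minimal sketch with Python's sqlite3 module standing in for MySQL; the helper `fetch_page` and the 25 generated rows are hypothetical.

```python
import sqlite3


def fetch_page(conn, last_id, page_size):
    """Return the next page of rows whose id is greater than last_id."""
    return conn.execute(
        "SELECT id, name FROM students WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [(i, f"student{i}") for i in range(1, 26)],
)

pages = []
last_id = 0  # start before the smallest id
while True:
    page = fetch_page(conn, last_id, 10)
    if not page:
        break  # no rows left
    pages.append(page)
    last_id = page[-1][0]  # the next page starts after this id

print([len(p) for p in pages])
```

Each query seeks straight to `last_id` through the primary key, so the cost per page stays roughly constant however deep the client scrolls. The trade-off is that pages can only be visited in sequence, not by jumping to an arbitrary page number.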
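7. Recursive Queries (CTE)
A recursive CTE handles hierarchical data such as an organization chart:
-- Return an employee together with all direct and indirect subordinates (example table: employee(id, name, manager_id))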
WITH RECURSIVE Subordinates AS (
SELECT id, name, manager_id
FROM employee
WHERE id = 1 -- root node (the CEO)
UNION ALL
SELECT e.id, e.name, e.manager_id
FROM employee e
INNER JOIN Subordinates s ON e.manager_id = s.id
)
SELECT * FROM Subordinates;
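To watch the recursion expand level by level, the sketch below runs the same CTE with an added `depth` column. It assumes SQLite via Python's sqlite3 module as a stand-in for MySQL, and the five employees are invented sample data; the `depth` column is an addition for illustration and not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        (1, "CEO", None),
        (2, "VP Eng", 1),
        (3, "VP Sales", 1),
        (4, "Engineer", 2),
        (5, "Contractor", None),  # not under the CEO, so never reached
    ],
)

rows = conn.execute(
    """
    WITH RECURSIVE Subordinates AS (
        -- Anchor member: the root of the hierarchy
        SELECT id, name, manager_id, 0 AS depth
        FROM employee
        WHERE id = 1
        UNION ALL
        -- Recursive member: employees whose manager is already in the result
        SELECT e.id, e.name, e.manager_id, s.depth + 1
        FROM employee e
        INNER JOIN Subordinates s ON e.manager_id = s.id
    )
    SELECT id, name, depth FROM Subordinates ORDER BY depth, id
    """
).fetchall()

for row in rows:
    print(row)
```

The recursion stops on its own once a step finds no new rows. If the data could contain a cycle (an employee who is indirectly their own manager), a depth limit in the recursive member would be needed to guarantee termination.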
8. Views and Materialized Views
8.1 Views
A view is a virtual table that simplifies complex queries:
CREATE VIEW HighScoreStudents AS
SELECT name, score
FROM students
WHERE score >= 90;
-- Use the view
SELECT * FROM HighScoreStudents;
8.2 Materialized Views
A materialized view physically stores the query result (this requires database support, for example PostgreSQL):
CREATE MATERIALIZED VIEW SalesSummary AS
SELECT product_id, SUM(quantity) AS total_sales
FROM orders
GROUP BY product_id;
-- Refresh the materialized view
REFRESH MATERIALIZED VIEW SalesSummary;
9. Putting It Together: E-commerce Data Analysis
-- Per user: number of orders, total amount, and most recent purchase date
SELECT
u.user_id,
u.name,
COUNT(o.order_id) AS order_count,
SUM(o.amount) AS total_amount,
MAX(o.order_date) AS last_order_date
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id
GROUP BY u.user_id, u.name;
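-- Month-over-month sales growth (window functions applied to monthly totals)
SELECT
    DATE_FORMAT(order_date, '%Y-%m') AS month,
    SUM(amount) AS monthly_sales,
    LAG(SUM(amount)) OVER (ORDER BY DATE_FORMAT(order_date, '%Y-%m')) AS prev_month_sales,
    -- Growth in percent; NULLIF avoids division by zero, and the first month yields NULL because it has no previous month
    (SUM(amount) - LAG(SUM(amount)) OVER (ORDER BY DATE_FORMAT(order_date, '%Y-%m')))
        / NULLIF(LAG(SUM(amount)) OVER (ORDER BY DATE_FORMAT(order_date, '%Y-%m')), 0) * 100 AS growth_rate
FROM orders
GROUP BY month
ORDER BY month;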
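The growth calculation is easier to read when the monthly totals are computed first and the window function is applied to that result. The sketch below does so with a CTE, using SQLite through Python's sqlite3 module as a stand-in for MySQL; `strftime('%Y-%m', ...)` plays the role of DATE_FORMAT, and the four orders are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("2024-01-05", 100),
        ("2024-01-20", 100),
        ("2024-02-10", 300),
        ("2024-03-01", 150),
    ],
)

rows = conn.execute(
    """
    WITH monthly AS (
        -- Step 1: one row per month with its total
        SELECT strftime('%Y-%m', order_date) AS month,
               SUM(amount)                   AS monthly_sales
        FROM orders
        GROUP BY strftime('%Y-%m', order_date)
    )
    -- Step 2: compare each month with the previous one
    SELECT month,
           monthly_sales,
           LAG(monthly_sales) OVER (ORDER BY month) AS prev_month_sales,
           ROUND(
               (monthly_sales - LAG(monthly_sales) OVER (ORDER BY month)) * 100.0
               / NULLIF(LAG(monthly_sales) OVER (ORDER BY month), 0),
               1
           ) AS growth_rate
    FROM monthly
    ORDER BY month
    """
).fetchall()

for row in rows:
    print(row)
```

January has no previous month, so both its previous total and its growth rate come back as NULL (None in Python). February grows by 50 percent and March falls by 50 percent.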
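10. Exercises
- Use a window function to compute the difference between each student's score and the class average.
- Write a stored procedure that deletes orders by user ID and automatically restores the corresponding stock.
- Optimize the following paginated query (assume the table holds millions of rows):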
SELECT * FROM orders ORDER BY order_date DESC LIMIT 100000, 10;