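Complex SQL Queries and Performance Optimization: A Practical Guide for Pharmaceutical ERP Systems
I. Database Characteristics in the Pharmaceutical Industry
In a pharmaceutical ERP system, the database typically contains the following core tables: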
sql
-- Drug master data
CREATE TABLE drug_master (
drug_id INT PRIMARY KEY,
drug_name VARCHAR(255),
specification VARCHAR(100),
batch_number VARCHAR(50),
expiration_date DATE,
storage_conditions TEXT,
approval_number VARCHAR(50) UNIQUE
);
-- Inventory management
CREATE TABLE inventory (
inventory_id INT PRIMARY KEY,
drug_id INT,
warehouse_id INT,
quantity DECIMAL(10,2),
last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (drug_id) REFERENCES drug_master(drug_id)
);
-- Order management
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
total_amount DECIMAL(10,2),
status ENUM('pending','shipped','delivered','returned')
);
-- Logistics and shipping
CREATE TABLE logistics (
logistics_id INT PRIMARY KEY,
order_id INT,
carrier VARCHAR(50),
tracking_number VARCHAR(100),
departure_date DATE,
arrival_date DATE,
FOREIGN KEY (order_id) REFERENCES orders(order_id)
);
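Pharmaceutical data has the following characteristics:
- Data sensitivity: it includes patient privacy data, drug patent information, and other sensitive records
- Strict time sensitivity: expiry management is expected to be precise to the minute, which the DATE column expiration_date above cannot express; a TIMESTAMP column would be needed for that precision
- Complex relationships: drug batches relate many-to-many to inventory, orders, and logistics
- Compliance requirements: systems must comply with regulations such as FDA 21 CFR Part 11 and GSP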
II. Core Complex-Query Skills in Practice
1. SQL Execution Order in Depth
sql
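-- Typical pharmaceutical query
-- Assumption: order_items(order_id, drug_id) is a line-item table that the schema above
-- does not define; orders itself has no drug_id column.
SELECT
d.drug_name,
SUM(i.quantity) AS total_stock,
COALESCE(oc.order_count, 0) AS order_count
FROM drug_master d
INNER JOIN inventory i ON d.drug_id = i.drug_id
-- Pre-aggregated so that this join cannot multiply the inventory rows
LEFT JOIN (
SELECT drug_id, COUNT(DISTINCT order_id) AS order_count
FROM order_items
GROUP BY drug_id
) oc ON d.drug_id = oc.drug_id
WHERE
d.expiration_date > CURRENT_DATE
AND i.warehouse_id IN (101, 102, 103)
GROUP BY d.drug_id, d.drug_name, oc.order_count
-- MySQL accepts the alias total_stock in HAVING; PostgreSQL requires the expression
HAVING SUM(i.quantity) > 500
ORDER BY total_stock DESC
LIMIT 10;
Logical execution order:
FROM drug_master → INNER JOIN inventory → LEFT JOIN order counts → WHERE filter → GROUP BY aggregation → HAVING filter → SELECT projection → DISTINCT deduplication (only with SELECT DISTINCT) → ORDER BY sort → LIMIT pagination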
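Because every join is resolved before GROUP BY runs, joining a drug to several inventory rows and several order rows multiplies the rows that SUM sees. The sketch below reproduces the effect with Python's built-in sqlite3 module standing in for the production database. The order_items table and the sample figures are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drug_master (drug_id INTEGER PRIMARY KEY, drug_name TEXT);
CREATE TABLE inventory (drug_id INTEGER, warehouse_id INTEGER, quantity REAL);
-- Hypothetical line-item table linking orders to drugs
CREATE TABLE order_items (order_id INTEGER, drug_id INTEGER);
INSERT INTO drug_master VALUES (1, 'Drug A');
INSERT INTO inventory VALUES (1, 101, 300), (1, 102, 400);
INSERT INTO order_items VALUES (10, 1), (11, 1), (12, 1);
""")

# Naive form: 2 inventory rows x 3 order rows = 6 joined rows before GROUP BY.
naive = conn.execute("""
SELECT SUM(i.quantity), COUNT(DISTINCT oi.order_id)
FROM drug_master d
JOIN inventory i ON i.drug_id = d.drug_id
LEFT JOIN order_items oi ON oi.drug_id = d.drug_id
GROUP BY d.drug_id
""").fetchone()

# Safer form: aggregate each one-to-many table on its own, then join the results.
safe = conn.execute("""
SELECT s.total_stock, COALESCE(c.order_count, 0)
FROM drug_master d
JOIN (SELECT drug_id, SUM(quantity) AS total_stock
      FROM inventory GROUP BY drug_id) s ON s.drug_id = d.drug_id
LEFT JOIN (SELECT drug_id, COUNT(DISTINCT order_id) AS order_count
           FROM order_items GROUP BY drug_id) c ON c.drug_id = d.drug_id
""").fetchone()

print("naive:", naive)  # stock is counted once per matching order row
print("safe: ", safe)
```

The real stock is 700 units, but the naive form reports three times that because each inventory row is repeated once per order row.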
2. Multi-Table Join Optimization Strategies
2.1 Choosing Between LEFT JOIN and RIGHT JOIN
sql
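-- Find all expired drugs that still have stock
-- INNER JOIN is needed here: a LEFT JOIN with i.quantity > 0 in the ON clause would
-- also return expired drugs without stock, with quantity set to NULL.
SELECT d.drug_name, i.quantity
FROM drug_master d
INNER JOIN inventory i
ON d.drug_id = i.drug_id
AND i.quantity > 0
WHERE d.expiration_date < CURRENT_DATE;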
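Where a filter on the right-hand table is placed in a LEFT JOIN decides which rows survive. The following sketch, run against an in-memory SQLite database with made-up sample rows and a fixed cut-off date, compares the same predicate in the ON clause and in the WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drug_master (drug_id INTEGER PRIMARY KEY, drug_name TEXT, expiration_date TEXT);
CREATE TABLE inventory (drug_id INTEGER, quantity REAL);
INSERT INTO drug_master VALUES (1, 'Drug A', '2020-01-01'), (2, 'Drug B', '2020-06-01');
INSERT INTO inventory VALUES (1, 50), (2, 0);
""")

# Filter in ON: every expired drug is kept; drugs without a match get NULL quantity.
on_filter = conn.execute("""
SELECT d.drug_name, i.quantity
FROM drug_master d
LEFT JOIN inventory i ON d.drug_id = i.drug_id AND i.quantity > 0
WHERE d.expiration_date < '2024-01-01'
ORDER BY d.drug_id
""").fetchall()

# Filter in WHERE: rows failing the predicate are removed,
# so the LEFT JOIN behaves like an INNER JOIN.
where_filter = conn.execute("""
SELECT d.drug_name, i.quantity
FROM drug_master d
LEFT JOIN inventory i ON d.drug_id = i.drug_id
WHERE d.expiration_date < '2024-01-01' AND i.quantity > 0
ORDER BY d.drug_id
""").fetchall()

print(on_filter)
print(where_filter)
```

Only the second form answers "expired but still in stock"; the first also lists Drug B, which has no remaining stock.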
2.2 Controlling the Risk of Cartesian Products
sql
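-- Safe form: state the join condition explicitly in ON, so that forgetting it is a syntax error
-- (carriers, with a carrier_code column, is assumed; the schema above does not define it)
SELECT c.carrier, l.tracking_number
FROM logistics l
INNER JOIN carriers c
ON l.carrier = c.carrier_code;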
3. Subqueries and CTEs in Pharmaceutical Applications
3.1 Drug Batch Tracing
sql
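-- Subquery version
-- Assumption: recall_notices(batch_number, recall_date) and the line-item table
-- order_items(order_id, drug_id) are not defined in the schema above.
SELECT d.drug_name, o.order_date
FROM drug_master d
JOIN order_items oi
ON d.drug_id = oi.drug_id
JOIN orders o
ON oi.order_id = o.order_id
WHERE d.batch_number IN (
SELECT batch_number
FROM recall_notices
WHERE recall_date BETWEEN '2024-01-01' AND '2024-12-31'
);
-- CTE version (DISTINCT keeps a batch with several notices from duplicating rows)
WITH recalled_batches AS (
SELECT DISTINCT batch_number
FROM recall_notices
WHERE recall_date BETWEEN '2024-01-01' AND '2024-12-31'
)
SELECT d.drug_name, o.order_date
FROM drug_master d
JOIN recalled_batches rb
ON d.batch_number = rb.batch_number
JOIN order_items oi
ON d.drug_id = oi.drug_id
JOIN orders o
ON oi.order_id = o.order_id;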
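To check that a subquery and its CTE rewrite really return the same rows, both can be run side by side on a small data set. This is a minimal sketch using SQLite; the recall_notices and order_items tables and all sample rows are hypothetical, and batch B001 deliberately has two recall notices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drug_master (drug_id INTEGER PRIMARY KEY, drug_name TEXT, batch_number TEXT);
CREATE TABLE recall_notices (batch_number TEXT, recall_date TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT);
CREATE TABLE order_items (order_id INTEGER, drug_id INTEGER);
INSERT INTO drug_master VALUES (1, 'Drug A', 'B001'), (2, 'Drug B', 'B002'), (3, 'Drug C', 'B003');
INSERT INTO recall_notices VALUES ('B001', '2024-03-10'), ('B001', '2024-06-01'), ('B003', '2023-05-01');
INSERT INTO orders VALUES (100, '2024-02-01'), (101, '2024-02-05'), (102, '2024-04-01');
INSERT INTO order_items VALUES (100, 1), (101, 2), (102, 1), (102, 3);
""")

subquery_rows = conn.execute("""
SELECT d.drug_name, o.order_date
FROM drug_master d
JOIN order_items oi ON oi.drug_id = d.drug_id
JOIN orders o ON o.order_id = oi.order_id
WHERE d.batch_number IN (
    SELECT batch_number FROM recall_notices
    WHERE recall_date BETWEEN '2024-01-01' AND '2024-12-31')
ORDER BY o.order_id
""").fetchall()

cte_rows = conn.execute("""
WITH recalled_batches AS (
    -- DISTINCT prevents the two notices for B001 from doubling the result
    SELECT DISTINCT batch_number FROM recall_notices
    WHERE recall_date BETWEEN '2024-01-01' AND '2024-12-31')
SELECT d.drug_name, o.order_date
FROM drug_master d
JOIN recalled_batches rb ON rb.batch_number = d.batch_number
JOIN order_items oi ON oi.drug_id = d.drug_id
JOIN orders o ON o.order_id = oi.order_id
ORDER BY o.order_id
""").fetchall()

print(subquery_rows)
print(cte_rows)
```

Without DISTINCT in the CTE, the join form would return each Drug A order twice, while the IN form would not.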
4. Advanced Window Function Applications
4.1 Inventory Turnover Analysis
sql
SELECT
drug_id,
quantity,
DATE_TRUNC('month', last_updated) AS month,
SUM(quantity) OVER (PARTITION BY drug_id ORDER BY DATE_TRUNC('month', last_updated)) AS rolling_stock,
AVG(quantity) OVER (PARTITION BY drug_id) AS avg_stock
FROM inventory;
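A detail worth seeing in action: when a window is ordered by a truncated month, all rows of the same month are peers under the default frame and share one cumulative total. The sketch below uses SQLite, with strftime standing in for DATE_TRUNC and invented sample rows, and contrasts the default frame with an explicit ROWS frame. It needs an SQLite build with window functions (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (drug_id INTEGER, quantity INTEGER, last_updated TEXT);
INSERT INTO inventory VALUES
    (1, 100, '2024-01-15'), (1, 50, '2024-02-10'),
    (1, 25, '2024-02-20'), (2, 10, '2024-01-05');
""")

rows = conn.execute("""
SELECT drug_id, month, quantity,
       -- Default frame: rows in the same month are peers and get the same total
       SUM(quantity) OVER (PARTITION BY drug_id ORDER BY month) AS rolling_stock,
       -- Explicit ROWS frame: a strict row-by-row running total
       SUM(quantity) OVER (PARTITION BY drug_id ORDER BY month, quantity
                           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS row_total
FROM (SELECT drug_id, quantity, strftime('%Y-%m', last_updated) AS month
      FROM inventory)
ORDER BY drug_id, month, quantity
""").fetchall()

for row in rows:
    print(row)
```

For drug 1, both February rows report a rolling_stock of 175, whereas row_total climbs from 125 to 175.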
4.2 Drug Expiry Alerts
sql
SELECT
drug_name,
expiration_date,
DENSE_RANK() OVER (ORDER BY expiration_date) AS expiry_rank,
LAG(expiration_date) OVER (ORDER BY expiration_date) AS previous_expiry
FROM drug_master;
5. Dynamic Classification with CASE
sql
-- Drug risk level assessment
SELECT
drug_name,
CASE
WHEN expiration_date < CURRENT_DATE + INTERVAL '3 months' THEN 'High risk'
WHEN expiration_date < CURRENT_DATE + INTERVAL '6 months' THEN 'Medium risk'
ELSE 'Low risk'
END AS risk_level
FROM drug_master;
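CASE branches are evaluated top to bottom and the first match wins, which is why the three-month test must come before the six-month test. The sketch below checks the classification on SQLite with a fixed reference date passed in as a parameter, so the result does not depend on the day it runs. The sample drugs and the :today parameter are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drug_master (drug_id INTEGER PRIMARY KEY, drug_name TEXT, expiration_date TEXT);
INSERT INTO drug_master VALUES
    (1, 'Drug A', '2024-07-15'),
    (2, 'Drug B', '2024-10-01'),
    (3, 'Drug C', '2025-08-01'),
    (4, 'Drug D', '2024-05-01');
""")

risk_rows = conn.execute("""
SELECT drug_name,
       CASE
           WHEN expiration_date < date(:today, '+3 months') THEN 'High risk'
           WHEN expiration_date < date(:today, '+6 months') THEN 'Medium risk'
           ELSE 'Low risk'
       END AS risk_level
FROM drug_master
ORDER BY drug_id
""", {"today": "2024-06-01"}).fetchall()

for row in risk_rows:
    print(row)
```

Drug D expired before the reference date and still lands in the high-risk bucket; a separate first branch would be needed if expired stock should be reported on its own.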
III. Core Performance Optimization Techniques
1. Execution Plans in Depth
sql
EXPLAIN ANALYZE
SELECT *
FROM drug_master
WHERE approval_number = '国药准字H20230001';
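Key indicators:
Indicator | Interpretation in a pharmaceutical context |
---|---|
Index Scan | The ideal case, for example locating a drug quickly through the index on approval_number |
Full Table Scan (Seq Scan in PostgreSQL, type = ALL in MySQL) | To be avoided on large tables; typical of fuzzy matches that no index can serve |
Rows Removed by Filter | Filter efficiency; a high value means many rows were read and then discarded, so the index strategy should be revisited |
Parallelism | Parallel query, suited to large-scale drug data analysis (requires hardware support) |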
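Reading a plan is easier with a concrete contrast. As a rough illustration, the sketch below asks SQLite for its plan for an indexed equality lookup and for a leading-wildcard LIKE. SQLite's EXPLAIN QUERY PLAN wording (SEARCH versus SCAN) differs from PostgreSQL and MySQL output, and the plan helper is a hypothetical convenience function, not part of any library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE drug_master (
    drug_id INTEGER PRIMARY KEY,
    drug_name TEXT,
    approval_number TEXT UNIQUE  -- UNIQUE creates an index automatically
)""")

def plan(sql):
    """Return the detail column of SQLite's query plan as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

indexed_plan = plan("SELECT * FROM drug_master WHERE approval_number = 'H20230001'")
fuzzy_plan = plan("SELECT * FROM drug_master WHERE drug_name LIKE '%cillin%'")

print(indexed_plan)  # an index lookup (SEARCH ... USING INDEX ...)
print(fuzzy_plan)    # a full scan of the table (SCAN ...)
```

The exact plan text varies between SQLite versions, so only the SEARCH or SCAN keyword should be relied on.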
2. Index Optimization Strategies
2.1 Composite Index Design
sql
-- Example composite index
CREATE INDEX idx_drug_warehouse ON inventory(drug_id, warehouse_id);
-- Query that benefits from the index
SELECT *
FROM inventory
WHERE drug_id = 123 AND warehouse_id = 101;
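A composite index can only be used for lookups that constrain its leading column. The following sketch checks this leftmost-prefix behaviour on SQLite for the idx_drug_warehouse index; the plan helper is a hypothetical convenience function, and other engines print their plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (
    inventory_id INTEGER PRIMARY KEY,
    drug_id INTEGER,
    warehouse_id INTEGER,
    quantity REAL
);
CREATE INDEX idx_drug_warehouse ON inventory(drug_id, warehouse_id);
""")

def plan(sql):
    """Return the detail column of SQLite's query plan as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

both_columns = plan("SELECT * FROM inventory WHERE drug_id = 123 AND warehouse_id = 101")
leading_only = plan("SELECT * FROM inventory WHERE drug_id = 123")
trailing_only = plan("SELECT * FROM inventory WHERE warehouse_id = 101")

print(both_columns)   # uses idx_drug_warehouse
print(leading_only)   # uses idx_drug_warehouse
print(trailing_only)  # no leading column, so the table is scanned
```

If queries filtering only on warehouse_id are common, a second index led by warehouse_id may be worth considering.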
2.2 Covering Indexes in Practice
sql
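-- Create a covering index (INCLUDE needs PostgreSQL 11+ or SQL Server; MySQL does not support it)
-- The filtered column is the index key; the selected columns are carried as payload
CREATE INDEX idx_drug_coverage ON drug_master(approval_number) INCLUDE (drug_name, specification, expiration_date);
-- Query that can be answered from the index alone
SELECT drug_name, specification, expiration_date
FROM drug_master
WHERE approval_number = '国药准字H20230001';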
3. Avoiding Full Table Scans
sql
-- Anti-pattern: wrapping the column in a function prevents index use (DATE_FORMAT is MySQL syntax)
SELECT *
FROM inventory
WHERE DATE_FORMAT(last_updated, '%Y-%m') = '2024-03';
-- Better: a half-open range on the bare column
SELECT *
FROM inventory
WHERE last_updated >= '2024-03-01'
AND last_updated < '2024-04-01';
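The effect of wrapping an indexed column in a function can be verified directly. This sketch runs both forms on SQLite, with strftime standing in for MySQL's DATE_FORMAT and a hypothetical index named idx_inventory_updated, and compares both the plans and the returned rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, quantity REAL, last_updated TEXT);
CREATE INDEX idx_inventory_updated ON inventory(last_updated);
INSERT INTO inventory (quantity, last_updated) VALUES
    (5, '2024-02-28 10:00:00'), (6, '2024-03-01 00:00:00'),
    (7, '2024-03-31 23:59:59'), (8, '2024-04-01 00:00:00');
""")

FUNC_SQL = "SELECT inventory_id FROM inventory WHERE strftime('%Y-%m', last_updated) = '2024-03'"
RANGE_SQL = ("SELECT inventory_id FROM inventory "
             "WHERE last_updated >= '2024-03-01' AND last_updated < '2024-04-01'")

def plan(sql):
    """Return the detail column of SQLite's query plan as a single string."""
    return " | ".join(r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + sql))

func_plan, range_plan = plan(FUNC_SQL), plan(RANGE_SQL)
func_ids = sorted(r[0] for r in conn.execute(FUNC_SQL))
range_ids = sorted(r[0] for r in conn.execute(RANGE_SQL))

print(func_plan)   # the function hides the column from the index
print(range_plan)  # the range predicate can seek into the index
print(func_ids, range_ids)
```

Both forms return the same two March rows; only the range form lets the engine seek into the index.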
4. Pagination Optimization
sql
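-- Inefficient pagination: the server reads and discards 100,000 rows (MySQL LIMIT offset, count syntax)
SELECT *
FROM orders
ORDER BY order_id
LIMIT 100000, 10;
-- Keyset pagination: 100000 stands for the last order_id seen on the previous page
SELECT *
FROM orders
WHERE order_id > 100000
ORDER BY order_id
LIMIT 10;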
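Keyset pagination only works if the caller carries the last key of the previous page forward, rather than assuming that IDs are contiguous. A minimal sketch on SQLite, with a hypothetical fetch_page helper and sample IDs that deliberately contain gaps:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, status TEXT)")
# IDs 1, 4, 7, ... leave gaps, as deletes or archiving would
conn.executemany("INSERT INTO orders VALUES (?, 'pending')",
                 [(i,) for i in range(1, 100, 3)])

def fetch_page(last_seen_id, page_size):
    """Return the next page of order IDs that follow last_seen_id."""
    rows = conn.execute(
        "SELECT order_id FROM orders WHERE order_id > ? ORDER BY order_id LIMIT ?",
        (last_seen_id, page_size),
    ).fetchall()
    return [r[0] for r in rows]

page1 = fetch_page(0, 5)
page2 = fetch_page(page1[-1], 5)  # continue from the last key actually returned

# The offset-based form returns the same rows but must skip the first page to get there.
offset_page2 = [r[0] for r in conn.execute(
    "SELECT order_id FROM orders ORDER BY order_id LIMIT 5 OFFSET 5")]

print(page1, page2, offset_page2)
```

The trade-off is that keyset pagination cannot jump to an arbitrary page number; it suits "next page" navigation and batch exports.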
5. Reducing Data Transfer
sql
-- Anti-pattern: transferring columns that are not needed
SELECT *
FROM logistics
WHERE carrier = '顺丰速运';
-- Better: select only the required columns
SELECT logistics_id, tracking_number, arrival_date
FROM logistics
WHERE carrier = '顺丰速运';
6. Partitioned Table Design
sql
-- Monthly partitioning example (MySQL syntax)
CREATE TABLE order_history (
order_id INT,
order_date DATE,
amount DECIMAL(10,2)
) PARTITION BY RANGE (YEAR(order_date) * 100 + MONTH(order_date)) (
-- Each bound is exclusive, so the partition for a month ends at the following month
PARTITION p202301 VALUES LESS THAN (202302),
PARTITION p202302 VALUES LESS THAN (202303),
...
);
IV. Advanced Optimization Scenarios
1. JOIN Order Optimization
sql
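-- Small table drives the large table (MySQL STRAIGHT_JOIN forces the left table to be read first)
-- drug_categories and inventory.category_id are assumed; the schema above does not define them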
SELECT *
FROM drug_categories dc
STRAIGHT_JOIN inventory i
ON dc.category_id = i.category_id
WHERE dc.category_name = '抗肿瘤药';
2. Temporary Tables and Materialized Views
sql
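-- Cache a frequently used query in a temporary table
CREATE TEMPORARY TABLE temp_high_value_orders AS
SELECT order_id, customer_id, total_amount
FROM orders
WHERE total_amount > 10000;
-- Create a materialized view (PostgreSQL syntax; MySQL has no materialized views)
CREATE MATERIALIZED VIEW mv_monthly_sales AS
SELECT
DATE_TRUNC('month', order_date) AS month,
SUM(total_amount) AS total_sales
FROM orders
GROUP BY DATE_TRUNC('month', order_date);
-- The view is a snapshot; run REFRESH MATERIALIZED VIEW mv_monthly_sales to update it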
3. Refreshing Statistics
sql
-- MySQL: refresh statistics manually
ANALYZE TABLE drug_master;
-- PostgreSQL
VACUUM ANALYZE drug_master;
4. Reducing Lock Contention
sql
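-- Stock deduction as an atomic conditional update: the quantity check and the write
-- happen in one statement, and zero affected rows signals insufficient stock.
-- (Classic optimistic locking would additionally compare a version column.)
UPDATE inventory
SET quantity = quantity - 10
WHERE drug_id = 123
AND warehouse_id = 101
AND quantity >= 10;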
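The conditional UPDATE is only useful if the application inspects how many rows it changed. The sketch below shows one way to do that with SQLite and Python's cursor.rowcount; the deduct helper and the starting stock of 25 units are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (drug_id INTEGER, warehouse_id INTEGER, quantity INTEGER)")
conn.execute("INSERT INTO inventory VALUES (123, 101, 25)")
conn.commit()

def deduct(drug_id, warehouse_id, amount):
    """Deduct stock if enough is available; return True when a row was updated."""
    cur = conn.execute(
        "UPDATE inventory SET quantity = quantity - ? "
        "WHERE drug_id = ? AND warehouse_id = ? AND quantity >= ?",
        (amount, drug_id, warehouse_id, amount),
    )
    conn.commit()
    return cur.rowcount == 1

# Three requests for 10 units against a stock of 25: the third must be refused.
results = [deduct(123, 101, 10) for _ in range(3)]
remaining = conn.execute("SELECT quantity FROM inventory").fetchone()[0]

print(results, remaining)
```

Because the check and the write are a single statement, stock can never go negative, even when several sessions issue the update concurrently.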
V. Tools and Debugging
1. Execution Plan Analysis Tools
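Tool | Use case in pharmaceutical systems |
---|---|
EXPLAIN FORMAT=JSON | Analyzing complex queries in MySQL |
EXPLAIN ANALYZE | Execution plans with actual run-time statistics in PostgreSQL |
pg_stat_statements | Slow-query monitoring in PostgreSQL |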
2. Monitoring Design
- Prometheus scrapes the database metrics
- Grafana displays slow-query trends, lock waits, and index usage
3. Benchmarking
bash
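# Example sysbench command (sysbench 1.0 and later take the test name as a positional argument; --test is deprecated)
# Run the same command with prepare in place of run beforehand to create the test tables
sysbench \
oltp_read_write \
--mysql-db=pharmacy_erp \
--mysql-user=benchmark \
--mysql-password=secure \
--table_size=1000000 \
run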
VI. Common Pitfalls and Solutions
1. Index Misuse
sql
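-- Anti-pattern: index on a low-selectivity column (orders.status has only four distinct values)
CREATE INDEX idx_low_selectivity ON orders(status);
-- Recommendation: create indexes only on high-selectivity columns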
2. Handling NULL Values
sql
-- Wrong: = NULL evaluates to UNKNOWN, so no row ever matches
SELECT *
FROM drug_master
WHERE approval_number = NULL;
-- Correct
SELECT *
FROM drug_master
WHERE approval_number IS NULL;
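The difference is easy to confirm on any engine. A small check using SQLite with two invented sample rows, one of which has no approval number:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE drug_master (drug_id INTEGER PRIMARY KEY, drug_name TEXT, approval_number TEXT);
INSERT INTO drug_master VALUES (1, 'Drug A', 'H20230001'), (2, 'Drug B', NULL);
""")

# Comparison with NULL is never true, so this returns nothing.
equals_null = conn.execute(
    "SELECT drug_name FROM drug_master WHERE approval_number = NULL").fetchall()

# IS NULL is the dedicated test for missing values.
is_null = conn.execute(
    "SELECT drug_name FROM drug_master WHERE approval_number IS NULL").fetchall()

print(equals_null)
print(is_null)
```

The first query silently returns an empty result instead of raising an error, which is what makes this mistake hard to spot.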
3. Subquery Optimization
sql
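-- Candidate for rewriting: correlated subquery (many optimizers already run EXISTS as a semi-join, so compare execution plans first)
SELECT drug_name
FROM drug_master d
WHERE EXISTS (
SELECT 1
FROM inventory i
WHERE i.drug_id = d.drug_id
AND i.quantity < 10
);
-- Alternative: JOIN rewrite (DISTINCT removes duplicates from drugs with several low-stock rows, but also merges drugs that share a name)
SELECT DISTINCT d.drug_name
FROM drug_master d
JOIN inventory i
ON d.drug_id = i.drug_id
WHERE i.quantity < 10;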
VII. Industry Best Practices
1. Data Archiving Strategy
sql
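-- Archive historical orders by year (here: 2023)
CREATE TABLE orders_archive_2023 AS
SELECT *
FROM orders
WHERE order_date >= '2023-01-01'
AND order_date < '2024-01-01';
-- Delete the archived rows after verifying the archive row count
-- (rows in logistics that reference these orders must be archived or removed first)
DELETE FROM orders
WHERE order_date >= '2023-01-01'
AND order_date < '2024-01-01';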
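Copy and delete are safest when they succeed or fail together. The sketch below wraps both steps in one transaction on SQLite, using the connection as a context manager. The archive_2023 helper, the pre-created archive table, and the sample orders are assumptions, and transactional behaviour around DDL differs between engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT, total_amount REAL);
CREATE TABLE orders_archive_2023 (order_id INTEGER PRIMARY KEY, order_date TEXT, total_amount REAL);
INSERT INTO orders VALUES
    (1, '2022-12-31', 10), (2, '2023-03-15', 20),
    (3, '2023-12-31', 30), (4, '2024-01-01', 40);
""")

YEAR_RANGE = ("2023-01-01", "2024-01-01")  # half-open range covering exactly 2023

def archive_2023():
    """Move the 2023 orders into the archive table; return how many rows moved."""
    with conn:  # commits if both statements succeed, rolls back if either raises
        conn.execute(
            "INSERT INTO orders_archive_2023 "
            "SELECT * FROM orders WHERE order_date >= ? AND order_date < ?",
            YEAR_RANGE)
        cur = conn.execute(
            "DELETE FROM orders WHERE order_date >= ? AND order_date < ?",
            YEAR_RANGE)
    return cur.rowcount

moved = archive_2023()
remaining = [r[0] for r in conn.execute("SELECT order_id FROM orders ORDER BY order_id")]
archived = [r[0] for r in conn.execute("SELECT order_id FROM orders_archive_2023 ORDER BY order_id")]

print(moved, remaining, archived)
```

The order from 2022 stays in the live table because the range has a lower bound, which matters when archives are kept per year.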
2. Data Masking
sql
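-- Mask data with built-in functions
-- customer_name and customer_email are assumed columns; the orders table above does not define them
SELECT
order_id,
CONCAT(LEFT(customer_name, 1), '***', RIGHT(customer_name, 1)) AS masked_name,
MD5(customer_email) AS hashed_email -- unsalted MD5 is pseudonymization only; known addresses can be matched by dictionary lookup
FROM orders;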
3. Setting Transaction Isolation Levels
sql
-- REPEATABLE READ (the MySQL InnoDB default; PostgreSQL defaults to READ COMMITTED)
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- For stricter consistency requirements
SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;
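VIII. Performance Optimization Roadmap
Phase | Goal | Key measures |
---|---|---|
Diagnosis | Identify performance bottlenecks | Execution plan analysis, slow-query log monitoring |
Optimization | Eliminate full table scans | Create suitable indexes, rewrite queries |
Consolidation | Establish a monitoring system | Real-time monitoring with Prometheus + Grafana |
Maintenance | Ongoing performance tuning | Refresh statistics regularly, review index usage |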
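IX. Summary and Outlook
In a pharmaceutical ERP system, efficient SQL queries and database performance directly affect operational efficiency and compliance. Mastering complex query techniques, applying performance optimization strategies, using the right tools, and designing for the industry's specific characteristics make it possible to build a stable, efficient database system. As AI technology develops, automated query optimization tools are likely to become a trend that practitioners in the pharmaceutical industry should keep watching.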