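Reporting is the most underrated module in an ERP system.
Many companies implementing ERP focus only on business processes and treat reporting as an afterthought. After go-live they discover that the data they need cannot be queried, the data they can query is inaccurate, and producing reports takes more effort than running the business itself.
This article explains, at the architecture level, how to design an ERP reporting system.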
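I. Core problems of a reporting system
- Freshness vs. performance
Business data changes in real time, but report queries must not stall the business system.
Running reports directly on the business database can drag the database down under high concurrency.
- Flexibility vs. standardization
Every department wants different reports. Sales wants performance rankings, the warehouse wants inventory turnover, and finance wants profit analysis.
Developing every report individually is a huge amount of work. Not developing them leaves users feeling the system is hard to use.
- Accuracy vs. complexity
Business data flows through many stages, and report calculations join across many of them. The more joins involved, the more likely an error.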
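II. Reporting architecture design
- Three-tier architecture
Business database → Data warehouse → Report service layer → Front-end presentation
The business database handles business operations only and runs no reports.
The data warehouse handles aggregation and calculation.
The report service layer handles querying and distribution, and feeds the front end.
This keeps business workloads and reporting workloads from interfering with each other.
- Business database design
Fields that reports depend on should be reserved in the business tables up front.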
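CREATE TABLE sa_order (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    order_no VARCHAR(30) NOT NULL,
    customer_id BIGINT NOT NULL,
    order_date DATE NOT NULL,
    total_amount DECIMAL(18,2) NOT NULL,
    status VARCHAR(20) NOT NULL,       -- order status; the ETL query in 3.2 filters on 'CONFIRMED'
    -- Fields reserved for reporting
    region_code VARCHAR(10),           -- region code, for regional sales analysis
    product_category VARCHAR(30),      -- product category, for category analysis
    sales_channel VARCHAR(20),         -- sales channel, for channel analysis
    department_id BIGINT,              -- department, for departmental performance
    salesperson_id BIGINT,             -- salesperson, read by the ETL query in 3.2
    -- Time dimension
    create_time DATETIME NOT NULL,
    update_time DATETIME,
    INDEX idx_date (order_date),
    INDEX idx_customer (customer_id),
    INDEX idx_region (region_code)
);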
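These fields may go unused in day-to-day operations, but they are essential for reporting.
- Data warehouse (DW) design
The core difference between a data warehouse and the business database: it is optimized for queries, not for writes.
3.1 Dimensional modeling
Use a star schema: fact tables plus dimension tables.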
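-- Sales fact table (one row per order line, as produced by the ETL in 3.2)
CREATE TABLE dw_sales_fact (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    -- Degenerate dimension: source order ID, needed for COUNT(DISTINCT order_id) in reports
    order_id BIGINT NOT NULL,
    -- Dimension foreign keys
    date_key INT NOT NULL,          -- references the date dimension
    customer_key INT NOT NULL,      -- references the customer dimension
    product_key INT NOT NULL,       -- references the product dimension
    region_key INT,                 -- references the region dimension
    salesperson_key INT,            -- references the salesperson dimension
    channel_key INT,                -- references the channel dimension
    -- Measures
    order_count INT DEFAULT 0,                  -- order count
    order_amount DECIMAL(18,2) DEFAULT 0,       -- order amount
    cost_amount DECIMAL(18,2) DEFAULT 0,        -- cost amount
    profit_amount DECIMAL(18,2) DEFAULT 0,      -- profit amount
    INDEX idx_date (date_key),
    INDEX idx_customer (customer_key),
    INDEX idx_product (product_key)
);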
-- Date dimension table
CREATE TABLE dim_date (
    date_key INT PRIMARY KEY,
    full_date DATE NOT NULL,
    year INT,
    quarter INT,
    month INT,
    week_of_year INT,
    day_of_week INT,
    is_weekend TINYINT,
    is_holiday TINYINT,
    fiscal_year INT,
    fiscal_quarter INT,
    fiscal_month INT
);
-- Product dimension table
CREATE TABLE dim_product (
    product_key INT PRIMARY KEY,
    product_id BIGINT NOT NULL,
    product_code VARCHAR(30),
    product_name VARCHAR(100),
    category_level1 VARCHAR(30),    -- level-1 category
    category_level2 VARCHAR(30),    -- level-2 category
    category_level3 VARCHAR(30),    -- level-3 category
    brand VARCHAR(50),
    unit VARCHAR(10),
    is_active TINYINT DEFAULT 1,
    created_date DATE,
    modified_date DATE
);
-- Customer dimension table
CREATE TABLE dim_customer (
    customer_key INT PRIMARY KEY,
    customer_id BIGINT NOT NULL,
    customer_code VARCHAR(30),
    customer_name VARCHAR(100),
    region VARCHAR(30),
    city VARCHAR(30),
    industry VARCHAR(50),
    customer_level VARCHAR(20),     -- customer tier
    credit_limit DECIMAL(18,2),
    is_active TINYINT DEFAULT 1
);
-- Salesperson dimension table
CREATE TABLE dim_salesperson (
    salesperson_key INT PRIMARY KEY,
    user_id BIGINT NOT NULL,
    user_name VARCHAR(50),
    department VARCHAR(50),
    position VARCHAR(30),
    region VARCHAR(30),
    entry_date DATE,
    is_active TINYINT DEFAULT 1
);
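The date dimension is usually generated once by a script rather than loaded from business data. The following is a minimal sketch of such a generator, not a definitive implementation. The function name `build_dim_date_rows`, the `fiscal_year_start_month` parameter, the `holidays` set, and the convention of labelling a fiscal year by the calendar year in which it ends are all assumptions made for illustration.

```python
from datetime import date, timedelta


def build_dim_date_rows(start, end, fiscal_year_start_month=1, holidays=frozenset()):
    """Yield one dict per calendar day in [start, end], keyed by dim_date's columns."""
    day = start
    while day <= end:
        # Months elapsed since the fiscal year began (0-based).
        fiscal_offset = (day.month - fiscal_year_start_month) % 12
        # Assumption: a fiscal year is labelled by the calendar year in which it ends.
        starts_late = fiscal_year_start_month > 1 and day.month >= fiscal_year_start_month
        yield {
            "date_key": day.year * 10000 + day.month * 100 + day.day,  # e.g. 20260501
            "full_date": day,
            "year": day.year,
            "quarter": (day.month - 1) // 3 + 1,
            "month": day.month,
            "week_of_year": day.isocalendar()[1],  # ISO 8601 week number
            "day_of_week": day.isoweekday(),       # 1 = Monday ... 7 = Sunday
            "is_weekend": int(day.isoweekday() >= 6),
            "is_holiday": int(day in holidays),
            "fiscal_year": day.year + (1 if starts_late else 0),
            "fiscal_quarter": fiscal_offset // 3 + 1,
            "fiscal_month": fiscal_offset + 1,
        }
        day += timedelta(days=1)
```

Because `date_key` is just the date written as an integer, the ETL can derive it from `order_date` without a lookup query.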
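3.2 The ETL process
Data extracted from the business database into the data warehouse has to be cleansed and transformed along the way.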
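class SalesETL:
    """Sales data ETL."""

    def __init__(self, source_db, dw_db):
        # source_db / dw_db are thin database wrappers, assumed to expose
        # query(), execute() and batch_insert()
        self.source_db = source_db
        self.dw_db = dw_db

    def extract(self, start_date, end_date):
        """Extract order lines from the business database."""
        # The join yields one row per order line, so amounts are computed per line.
        # Repeating the order-level total_amount on every line would be
        # double-counted as soon as a report applies SUM().
        sql = """
            SELECT
                o.id AS order_id,
                o.order_no,
                o.customer_id,
                o.order_date,
                od.product_id,
                od.quantity,
                od.unit_price,
                od.cost_price,
                od.quantity * od.unit_price AS order_amount,
                od.quantity * od.cost_price AS cost_amount,
                od.quantity * (od.unit_price - od.cost_price) AS profit_amount,
                o.region_code,
                o.salesperson_id,
                o.sales_channel
            FROM sa_order o
            JOIN sa_order_detail od ON o.id = od.order_id
            WHERE o.order_date BETWEEN %s AND %s
              AND o.status = 'CONFIRMED'
        """
        return self.source_db.query(sql, (start_date, end_date))

    def transform(self, raw_data):
        """Map business IDs to dimension keys (the get_*_key helpers are not shown)."""
        result = []
        for row in raw_data:
            result.append({
                'order_id': row['order_id'],
                'date_key': self.get_date_key(row['order_date']),                      # date dimension
                'customer_key': self.get_customer_key(row['customer_id']),             # customer dimension
                'product_key': self.get_product_key(row['product_id']),                # product dimension
                'region_key': self.get_region_key(row['region_code']),                 # region dimension
                'salesperson_key': self.get_salesperson_key(row['salesperson_id']),    # salesperson dimension
                'channel_key': self.get_channel_key(row['sales_channel']),             # channel dimension
                'order_amount': row['order_amount'],
                'cost_amount': row['cost_amount'],
                'profit_amount': row['profit_amount'],
                # One per order line; use COUNT(DISTINCT order_id) to count orders
                'order_count': 1,
            })
        return result

    def load(self, transformed_data):
        """Load into the data warehouse."""
        if not transformed_data:
            return
        # Delete the dates being loaded first so that reruns are idempotent
        # (assumes each run reloads whole days)
        date_keys = sorted({row['date_key'] for row in transformed_data})
        placeholders = ', '.join(['%s'] * len(date_keys))
        self.dw_db.execute(
            f"DELETE FROM dw_sales_fact WHERE date_key IN ({placeholders})",
            date_keys,
        )
        # Then insert the fresh rows
        self.dw_db.batch_insert('dw_sales_fact', transformed_data)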
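The `transform` step calls helpers such as `get_customer_key` that are not defined in the article. One possible shape for them is sketched below, assuming the mapping from business ID to surrogate key is small enough to load once per run. The class name `DimensionKeyCache`, the `load_mapping` callable, and the `-1` "unknown member" key are hypothetical, not part of the original design.

```python
class DimensionKeyCache:
    """Resolve business IDs to surrogate keys, loading the mapping once per ETL run."""

    # Hypothetical convention: every dimension table has a row with key -1
    # standing for "unknown", so fact rows never carry a dangling key.
    UNKNOWN_KEY = -1

    def __init__(self, load_mapping):
        # load_mapping: callable returning {business_id: surrogate_key},
        # e.g. the result of SELECT customer_id, customer_key FROM dim_customer
        self._load_mapping = load_mapping
        self._mapping = None
        self.misses = set()

    def get(self, business_id):
        if self._mapping is None:
            # Lazy load: one query per dimension instead of one per fact row
            self._mapping = dict(self._load_mapping())
        key = self._mapping.get(business_id)
        if key is None:
            # Record the miss so it can be reported after the run
            self.misses.add(business_id)
            return self.UNKNOWN_KEY
        return key
```

With this in place, `get_customer_key` could simply delegate to `customer_keys.get(customer_id)`, and a non-empty `misses` set after a run points at dimension rows that need to be loaded before the facts.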
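3.3 Scheduling strategy
ETL should not run continuously in real time; it is too resource-intensive. Schedule it at the following frequencies: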
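etl_schedule:
  # Daily summary: runs at 01:00 every day over the previous day's data
  dw_sales_daily:
    schedule: "0 1 * * *"
    time_range: "yesterday"
  # Monthly summary: runs at 03:00 on the 1st of each month over the previous month's data
  dw_sales_monthly:
    schedule: "0 3 1 * *"
    time_range: "last_month"
  # Near-real-time snapshot: runs every 15 minutes (incremental only)
  dw_sales_realtime:
    schedule: "*/15 * * * *"
    time_range: "last_15_minutes"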
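The `time_range` labels in the schedule still have to be turned into concrete boundaries by the job runner. A minimal sketch of that step follows. The function name `resolve_time_range` and the choice of half-open `[start, end)` windows are assumptions; the article does not say how the labels are interpreted.

```python
from datetime import datetime, timedelta


def resolve_time_range(name, now):
    """Return the half-open window [start, end) for a time_range label."""
    today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if name == "yesterday":
        return today - timedelta(days=1), today
    if name == "last_month":
        this_month = today.replace(day=1)
        # Step back one day to land in the previous month, then snap to its first day
        last_month = (this_month - timedelta(days=1)).replace(day=1)
        return last_month, this_month
    if name == "last_15_minutes":
        # Align to the quarter-hour so consecutive runs neither overlap nor leave gaps
        end = now.replace(minute=now.minute - now.minute % 15, second=0, microsecond=0)
        return end - timedelta(minutes=15), end
    raise ValueError(f"unknown time_range: {name}")
```

Half-open windows mean a row stamped exactly at midnight belongs to one day only, which matters when a job is rerun.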
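III. Summary table design
Aggregating directly on the fact table gets slow as data volume grows, so build summary tables in advance.
- Daily summary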
CREATE TABLE dw_sales_daily_summary (
    date_key INT NOT NULL,
    region_key INT,
    product_category VARCHAR(30),
    order_count INT,
    order_amount DECIMAL(18,2),
    cost_amount DECIMAL(18,2),
    profit_amount DECIMAL(18,2),
    avg_order_amount DECIMAL(18,2),
    profit_rate DECIMAL(5,4),
    PRIMARY KEY (date_key, region_key, product_category)
);
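The two derived columns, `avg_order_amount` and `profit_rate`, have to be computed from the summed measures rather than by averaging per-row ratios. The sketch below shows that aggregation in plain Python under stated assumptions: `summarize_daily` is a hypothetical name, each input row is a dict that already carries `product_category`, and amounts are `Decimal` values.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP

MEASURES = ("order_count", "order_amount", "cost_amount", "profit_amount")


def summarize_daily(fact_rows):
    """Group rows by (date_key, region_key, product_category) and derive the ratios."""
    totals = defaultdict(lambda: {m: Decimal("0") for m in MEASURES})
    for row in fact_rows:
        bucket = totals[(row["date_key"], row["region_key"], row["product_category"])]
        for measure in MEASURES:
            bucket[measure] += row[measure]

    summary = []
    for (date_key, region_key, category), t in sorted(totals.items()):
        # Ratios are taken over the summed totals; a zero denominator yields None
        avg = t["order_amount"] / t["order_count"] if t["order_count"] else None
        rate = t["profit_amount"] / t["order_amount"] if t["order_amount"] else None
        summary.append({
            "date_key": date_key,
            "region_key": region_key,
            "product_category": category,
            **t,
            "avg_order_amount": None if avg is None else avg.quantize(Decimal("0.01"), ROUND_HALF_UP),
            "profit_rate": None if rate is None else rate.quantize(Decimal("0.0001"), ROUND_HALF_UP),
        })
    return summary
```

The rounding mirrors the column definitions: two decimal places for amounts and four for the rate.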
- Monthly summary
CREATE TABLE dw_sales_monthly_summary (
    -- YEAR_MONTH is a reserved word in MySQL, so the column name must be quoted
    `year_month` INT NOT NULL,      -- e.g. 202605
    region_key INT,
    product_category VARCHAR(30),
    order_count INT,
    order_amount DECIMAL(18,2),
    cost_amount DECIMAL(18,2),
    profit_amount DECIMAL(18,2),
    -- Month-over-month growth
    mom_order_count_rate DECIMAL(5,4),
    mom_order_amount_rate DECIMAL(5,4),
    PRIMARY KEY (`year_month`, region_key, product_category)
);
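How the month-over-month growth rate is computed:
INSERT INTO dw_sales_monthly_summary
SELECT
    DATE_FORMAT(s.order_date, '%Y%m') AS `year_month`,
    r.region_key,
    p.category_level1 AS product_category,
    COUNT(DISTINCT s.id) AS order_count,
    SUM(s.total_amount) AS order_amount,
    -- ... other fields
    -- NULL when the previous month is missing or zero; MAX() only picks the single matched prev row
    (SUM(s.total_amount) - MAX(prev.order_amount)) / NULLIF(MAX(prev.order_amount), 0) AS mom_order_amount_rate
FROM dw_sales_fact f
-- Join the dimension tables
JOIN fact_orders s ON ...
JOIN dim_region r ON ...
JOIN dim_product p ON ...
-- Previous month of the same region and category; shift the date first, then format it
LEFT JOIN dw_sales_monthly_summary prev
    ON prev.`year_month` = DATE_FORMAT(s.order_date - INTERVAL 1 MONTH, '%Y%m')
    AND prev.region_key = r.region_key
    AND prev.product_category = p.category_level1
GROUP BY DATE_FORMAT(s.order_date, '%Y%m'), r.region_key, p.category_level1;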
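Two details of the month-over-month calculation are easy to get wrong: finding the previous month from an integer like 202601, and dividing when the previous month has no sales. A small sketch of both, with hypothetical function names `previous_year_month` and `mom_rate`:

```python
from decimal import Decimal


def previous_year_month(year_month):
    """202605 -> 202604, 202601 -> 202512 (plain subtraction would give the invalid 202600)."""
    year, month = divmod(year_month, 100)
    return (year - 1) * 100 + 12 if month == 1 else year_month - 1


def mom_rate(current, previous):
    """Month-over-month growth rate; None when there is no usable base month."""
    if previous is None or previous == 0:
        return None
    return (current - previous) / previous
```

For example, sales of 120 after a month of 100 give a rate of 0.2, that is, 20% growth. Returning `None` for a missing base keeps "no comparison possible" distinct from "no change".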
IV. Report service layer
- Report definitions
CREATE TABLE rpt_report_config (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    report_code VARCHAR(30) NOT NULL UNIQUE,
    report_name VARCHAR(100) NOT NULL,
    -- Data source
    data_source VARCHAR(50),        -- DW/REALTIME/CUSTOM
    sql_template TEXT NOT NULL,     -- SQL template, supports parameters
    -- Dimension configuration
    dimensions JSON,                -- dimensions available for grouping
    measures JSON,                  -- measure fields
    -- Permissions
    role_ids JSON,                  -- roles allowed to view the report
    -- Caching
    cache_ttl INT DEFAULT 300,      -- cache lifetime in seconds
    -- Status
    is_active TINYINT DEFAULT 1
);
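- Dynamic queries
Reports are not hard-coded SQL; the SQL is assembled dynamically.
Whichever dimensions and measures the user selects, the system builds the SQL automatically.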
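public class ReportQueryBuilder
{
    // Operators allowed in filters; anything else is rejected to prevent injection
    private static readonly HashSet<string> AllowedOperators =
        new() { "=", "<>", ">", ">=", "<", "<=" };

    private readonly ReportConfig _config;
    private readonly List<string> _selectedDimensions = new();
    private readonly List<string> _selectedMeasures = new();
    private readonly List<string> _whereConditions = new();
    private readonly Dictionary<string, object> _parameters = new();

    public ReportQueryBuilder(ReportConfig config)
    {
        _config = config;
    }

    public ReportQueryBuilder AddDimension(string dimension)
    {
        // Only accept dimensions declared in the report configuration
        if (_config.Dimensions.Contains(dimension))
            _selectedDimensions.Add(dimension);
        return this;
    }

    public ReportQueryBuilder AddMeasure(string measure)
    {
        if (_config.Measures.Contains(measure))
            _selectedMeasures.Add(measure);
        return this;
    }

    public ReportQueryBuilder AddFilter(string field, string op, object value)
    {
        // Injection guard: only predefined fields and whitelisted operators
        if (!_config.AllFields.Contains(field))
            throw new SecurityException($"Field {field} not allowed");
        if (!AllowedOperators.Contains(op))
            throw new SecurityException($"Operator {op} not allowed");
        // Numbered parameter names let the same field be filtered twice (e.g. a date range)
        var paramName = $"p{_parameters.Count}";
        _whereConditions.Add($"{field} {op} @{paramName}");
        _parameters[paramName] = value;
        return this;
    }

    public (string sql, object param) Build()
    {
        if (_selectedMeasures.Count == 0)
            throw new InvalidOperationException("At least one measure is required");
        // Dimensions first, then the aggregated measures
        var columns = _selectedDimensions
            .Concat(_selectedMeasures.Select(m => $"{_config.GetAggregation(m)} AS {m}"));
        // DataSource is assumed to resolve to a table or view name
        var sql = $"SELECT {string.Join(", ", columns)} FROM {_config.DataSource}";
        // WHERE
        if (_whereConditions.Any())
            sql += " WHERE " + string.Join(" AND ", _whereConditions);
        // GROUP BY / ORDER BY only apply when at least one dimension is selected
        if (_selectedDimensions.Any())
        {
            var dims = string.Join(", ", _selectedDimensions);
            sql += $" GROUP BY {dims} ORDER BY {dims}";
        }
        // Return the collected parameter values together with the SQL
        return (sql, _parameters);
    }
}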
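The same idea, sketched in Python to keep it next to the ETL code: identifiers come only from a whitelist in the report configuration, and filter values only ever travel as bound parameters. The function name `build_report_sql`, the shape of the `config` dict, and the `%s` placeholder style are assumptions for illustration.

```python
ALLOWED_OPERATORS = {"=", "<>", ">", ">=", "<", "<="}


def build_report_sql(config, dimensions, measures, filters):
    """Return (sql, params) using only identifiers whitelisted in config."""
    # Silently drop anything the report configuration does not declare
    dims = [d for d in dimensions if d in config["dimensions"]]
    selected = [m for m in measures if m in config["measures"]]
    if not selected:
        raise ValueError("at least one configured measure is required")

    # config["measures"] maps a measure name to its aggregation expression
    columns = dims + [f'{config["measures"][m]} AS {m}' for m in selected]
    sql = f'SELECT {", ".join(columns)} FROM {config["table"]}'

    conditions, params = [], []
    for field, op, value in filters:
        if field not in config["filter_fields"] or op not in ALLOWED_OPERATORS:
            raise ValueError(f"filter not allowed: {field} {op}")
        conditions.append(f"{field} {op} %s")  # the value is bound, never interpolated
        params.append(value)
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    if dims:
        sql += " GROUP BY " + ", ".join(dims) + " ORDER BY " + ", ".join(dims)
    return sql, params
```

Rejecting unknown filter fields loudly, while quietly ignoring unknown dimensions and measures, is a design choice: a dropped column is harmless, but a dropped filter would silently widen the result set.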
- Report caching
Not every query needs to hit the database; on a cache hit the result is returned directly.
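public class ReportCacheService
{
    private readonly IDatabase _db;
    private readonly ICache _cache;

    public ReportCacheService(IDatabase db, ICache cache)
    {
        _db = db;
        _cache = cache;
    }

    public ReportData GetReport(string reportCode, ReportQuery query)
    {
        // Build the cache key
        var cacheKey = $"rpt:{reportCode}:{query.ToHash()}";
        // Try the cache first
        var cached = _cache.Get<ReportData>(cacheKey);
        if (cached != null)
            return cached;
        // Cache miss: query the database
        var config = _db.GetReportConfig(reportCode);
        var builder = new ReportQueryBuilder(config);
        foreach (var dimension in query.Dimensions)
            builder.AddDimension(dimension);
        foreach (var measure in query.Measures)
            builder.AddMeasure(measure);
        // Field, Op and Value are assumed property names on the filter objects
        foreach (var filter in query.Filters)
            builder.AddFilter(filter.Field, filter.Op, filter.Value);
        var (sql, param) = builder.Build();
        var data = _db.Query<ReportData>(sql, param);
        // Write to the cache
        _cache.Set(cacheKey, data, TimeSpan.FromSeconds(config.CacheTtl));
        return data;
    }

    // When a report's underlying data changes, clear its cached results
    public void InvalidateCache(string reportCode)
    {
        var pattern = $"rpt:{reportCode}:*";
        _cache.DeleteByPattern(pattern);
    }
}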
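The cache only works if the same logical query always produces the same key, and if entries expire after `cache_ttl` seconds. Both pieces are sketched below with an in-memory store. `cache_key` and `TTLCache` are hypothetical names standing in for `query.ToHash()` and the `ICache` implementation, which the article does not show.

```python
import hashlib
import json
import time


def cache_key(report_code, query):
    """Stable key: the same query dict hashes the same regardless of key order."""
    canonical = json.dumps(query, sort_keys=True, separators=(",", ":"), default=str)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"rpt:{report_code}:{digest}"


class TTLCache:
    """Minimal in-memory cache with per-entry expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._entries[key] = (value, self._clock() + ttl_seconds)

    def delete_by_prefix(self, prefix):
        for key in [k for k in self._entries if k.startswith(prefix)]:
            del self._entries[key]
```

Note that list order inside the query still matters: `["region", "date"]` and `["date", "region"]` produce different keys, which is correct here because dimension order changes the result layout.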
V. Common report designs
- Daily sales report
SELECT
    d.full_date AS report_date,
    r.region_name AS region,
    COUNT(DISTINCT f.order_id) AS order_count,
    SUM(f.order_amount) AS sales_amount,
    SUM(f.profit_amount) AS profit,
    -- NULLIF avoids a division by zero when a region has no sales
    ROUND(SUM(f.profit_amount) / NULLIF(SUM(f.order_amount), 0) * 100, 2) AS profit_margin_pct
FROM dw_sales_fact f
JOIN dim_date d ON f.date_key = d.date_key
JOIN dim_region r ON f.region_key = r.region_key
WHERE d.full_date = ?
GROUP BY d.full_date, r.region_key, r.region_name;
- Product sales ranking
SELECT
    p.product_name AS product_name,
    p.category_level1 AS category,
    -- The fact table has no quantity column, so this counts orders rather than units sold
    SUM(f.order_count) AS order_count,
    SUM(f.order_amount) AS sales_amount,
    SUM(f.profit_amount) AS profit,
    ROUND(SUM(f.profit_amount) / NULLIF(SUM(f.order_amount), 0) * 100, 2) AS profit_margin_pct
FROM dw_sales_fact f
JOIN dim_product p ON f.product_key = p.product_key
WHERE f.date_key BETWEEN ? AND ?
GROUP BY p.product_key, p.product_name, p.category_level1
ORDER BY SUM(f.order_amount) DESC
LIMIT 20;
- Customer ABC analysis
-- Uses window functions (MySQL 8.0+). A running total kept in a user variable is
-- unreliable here because it is evaluated before ORDER BY is applied.
WITH customer_sales AS (
    SELECT
        c.customer_key,
        c.customer_name,
        SUM(f.order_amount) AS total_amount
    FROM dw_sales_fact f
    JOIN dim_customer c ON f.customer_key = c.customer_key
    WHERE f.date_key BETWEEN ? AND ?
    GROUP BY c.customer_key, c.customer_name
),
ranked AS (
    SELECT
        customer_name,
        total_amount,
        SUM(total_amount) OVER () AS grand_total,
        -- customer_key breaks ties so the running total is deterministic
        SUM(total_amount) OVER (ORDER BY total_amount DESC, customer_key) AS running_total
    FROM customer_sales
)
SELECT
    customer_name,
    total_amount AS total_sales,
    ROUND(total_amount * 100.0 / grand_total, 2) AS share_pct,
    running_total AS cumulative_amount,
    CASE
        WHEN running_total <= grand_total * 0.7 THEN 'A'
        WHEN running_total <= grand_total * 0.9 THEN 'B'
        ELSE 'C'
    END AS customer_class
FROM ranked
ORDER BY total_amount DESC;
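The classification rule itself is short: sort customers by sales, accumulate, and grade by cumulative share (up to 70% is A, up to 90% is B, the rest C). A minimal Python sketch of that rule, useful for checking the SQL output on a small sample; the function name `abc_classify` and its parameters are hypothetical.

```python
def abc_classify(sales_by_customer, a_cutoff=0.7, b_cutoff=0.9):
    """Return [(customer, amount, cumulative_share, grade)] sorted by amount descending."""
    total = sum(sales_by_customer.values())
    if total <= 0:
        return []
    # Sort by amount descending; the customer name breaks ties deterministically
    ranked = sorted(sales_by_customer.items(), key=lambda item: (-item[1], item[0]))
    result, running = [], 0
    for customer, amount in ranked:
        running += amount
        share = running / total
        grade = "A" if share <= a_cutoff else "B" if share <= b_cutoff else "C"
        result.append((customer, amount, share, grade))
    return result
```

The grade depends on the cumulative share after adding the customer, so the running total must be computed in sorted order.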
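- Inventory turnover analysis
SELECT
    p.product_name AS product_name,
    p.category_level1 AS category,
    i.current_qty AS current_stock,
    -- Monthly turnover = quantity shipped this month / current stock (stock clamped to at least 1)
    ROUND(
        COALESCE(s.total_out_qty, 0) / GREATEST(i.current_qty, 1), 2
    ) AS monthly_turnover,
    -- Days of supply = current stock / average daily outflow
    ROUND(
        GREATEST(i.current_qty, 0) /
        GREATEST(COALESCE(s.avg_daily_out_qty, 1), 0.01), 0
    ) AS days_of_supply
FROM dw_inventory_current i
JOIN dim_product p ON i.product_key = p.product_key
LEFT JOIN dw_inventory_summary_monthly s
    ON i.product_key = s.product_key
    -- YEAR_MONTH is a reserved word in MySQL, so the column name is quoted
    AND s.`year_month` = DATE_FORMAT(CURDATE(), '%Y%m')
ORDER BY days_of_supply ASC;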
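The two ratios in the query are simple arithmetic, shown here as a small Python sketch. Unlike the SQL, which clamps the denominators with `GREATEST`, this version returns `None` when a ratio is undefined; that is a deliberate difference, and the function name `turnover_metrics` and the 30-day month are assumptions.

```python
def turnover_metrics(current_qty, month_out_qty, days_in_month=30):
    """Return (monthly_turnover, days_of_supply); None where a ratio is undefined."""
    # Monthly turnover: how many times the current stock was shipped this month
    turnover = month_out_qty / current_qty if current_qty > 0 else None
    # Days of supply: how long the current stock lasts at this month's average daily outflow
    avg_daily_out = month_out_qty / days_in_month
    days_of_supply = current_qty / avg_daily_out if avg_daily_out > 0 else None
    return turnover, days_of_supply
```

For example, 200 units in stock with 600 shipped this month turns over 3 times and covers 10 days. A product with stock but no outflow gets `None` for days of supply, which flags it as slow-moving instead of assigning it an arbitrary number.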
VI. Report export
- Excel export
A large report must not be loaded into memory all at once. Stream the query result and write it row by row.
public async Task ExportToExcel(string reportCode, ReportQuery query, Stream output)
{
    using var package = new ExcelPackage();
    var sheet = package.Workbook.Worksheets.Add("Report");
    // Write the header row
    var headers = query.SelectedDimensions.Concat(query.SelectedMeasures).ToList();
    for (int i = 0; i < headers.Count; i++)
        sheet.Cells[1, i + 1].Value = headers[i];
    // Stream the query and write row by row, so the result set is never fully buffered.
    // Note: the worksheet itself is still built in memory until it is saved.
    int row = 2;
    await foreach (var record in _db.StreamQuery(reportCode, query))
    {
        for (int i = 0; i < headers.Count; i++)
            sheet.Cells[row, i + 1].Value = record[headers[i]];
        row++;
    }
    await package.SaveAsAsync(output);
}
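When the file format is negotiable, CSV can be streamed end to end, since each row is written and forgotten. The sketch below uses CSV rather than Excel, which is a swapped-in technique and not what the article's export does. `export_csv` and its `batch_size` parameter are hypothetical, and `rows` is assumed to be any iterable of dicts, such as a database cursor wrapped in a generator.

```python
import csv


def export_csv(rows, headers, output, batch_size=1000):
    """Write rows to a text stream one at a time and return the number of rows written."""
    writer = csv.writer(output)
    writer.writerow(headers)
    count = 0
    for record in rows:
        # Missing fields become empty cells instead of raising KeyError
        writer.writerow([record.get(h, "") for h in headers])
        count += 1
        if count % batch_size == 0:
            output.flush()  # push finished rows downstream periodically
    return count
```

Because `rows` is consumed lazily, memory use stays flat no matter how many rows the report contains.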
- Scheduled delivery
Some reports need to be emailed to the relevant people on a schedule.
report_schedule:
  - name: "Daily sales report"
    report_code: "sales_daily"
    schedule: "0 8 * * *"    # every day at 08:00
    recipients: ["sales_manager@company.com", "gm@company.com"]
    format: "excel"
  - name: "Monthly inventory turnover analysis"
    report_code: "inventory_turnover"
    schedule: "0 9 1 * *"    # 09:00 on the 1st of each month
    recipients: ["warehouse_manager@company.com"]
    format: "pdf"
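Each schedule entry maps fairly directly onto an email with one attachment. The following sketch builds such a message with Python's standard `email` package and leaves sending to the caller. The function name `build_report_email`, the sender address, the body text, and the attachment naming scheme are assumptions for illustration.

```python
from email.message import EmailMessage

# Schedule "format" value -> (MIME maintype, MIME subtype, file extension)
MIME_TYPES = {
    "excel": ("application", "vnd.openxmlformats-officedocument.spreadsheetml.sheet", "xlsx"),
    "pdf": ("application", "pdf", "pdf"),
}


def build_report_email(job, attachment_bytes, sender, run_date):
    """Build (but do not send) the email for one report_schedule entry."""
    maintype, subtype, ext = MIME_TYPES[job["format"]]
    msg = EmailMessage()
    msg["Subject"] = f'{job["name"]} {run_date:%Y-%m-%d}'
    msg["From"] = sender
    msg["To"] = ", ".join(job["recipients"])
    msg.set_content("The scheduled report is attached.")
    msg.add_attachment(
        attachment_bytes,
        maintype=maintype,
        subtype=subtype,
        filename=f'{job["report_code"]}_{run_date:%Y%m%d}.{ext}',
    )
    return msg
```

The returned message can be handed to `smtplib.SMTP.send_message`. Keeping message construction separate from delivery makes the formatting testable without a mail server.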
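VII. Performance tuning essentials
- Partitioned tables
Partition the fact table by month, so that queries filtering on the partition key are pruned to the relevant partitions automatically.
- Columnar storage
Columnar storage is more efficient for OLAP workloads. ClickHouse and Doris are both good choices.
- Materialized views
For complex aggregate queries, use materialized views to precompute the results.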
-- Oracle-style syntax; MySQL has no native materialized views, and other engines differ
CREATE MATERIALIZED VIEW mv_sales_monthly_region
REFRESH COMPLETE ON DEMAND
AS
SELECT
    TO_CHAR(order_date, 'YYYYMM') AS year_month,
    region_code,
    COUNT(*) AS order_count,
    SUM(total_amount) AS total_amount
FROM sa_order
GROUP BY TO_CHAR(order_date, 'YYYYMM'), region_code;
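- Slow query monitoring
-- Log the duration of every report query (assumes rpt_query_log also has a
-- query_time column that defaults to the insert timestamp)
INSERT INTO rpt_query_log (report_code, query_time_ms, row_count, user_id)
VALUES (?, ?, ?, ?);
-- Find slow reports: average above 3 seconds over the last 7 days
SELECT report_code, AVG(query_time_ms) AS avg_time, COUNT(*) AS query_count
FROM rpt_query_log
WHERE query_time > DATE_SUB(NOW(), INTERVAL 7 DAY)
GROUP BY report_code
HAVING avg_time > 3000
ORDER BY avg_time DESC;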
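On the application side, the log row can be produced by wrapping each report query in a timer. A minimal sketch follows, assuming a `log_sink` callable that performs the `INSERT` shown above. The name `timed_report_query` and the `stats` dict used to report the row count are hypothetical.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_report_query(report_code, user_id, log_sink, clock=time.perf_counter):
    """Time the wrapped block and emit (report_code, query_time_ms, row_count, user_id)."""
    stats = {"row_count": 0}  # the caller fills in the row count inside the block
    started = clock()
    try:
        yield stats
    finally:
        # Log even when the query raises, so failing slow queries are still visible
        elapsed_ms = int((clock() - started) * 1000)
        log_sink((report_code, elapsed_ms, stats["row_count"], user_id))
```

Usage would look like `with timed_report_query("sales_daily", user_id, insert_log) as stats:`, followed by running the query and setting `stats["row_count"]`. The tuple order matches the column order of the `INSERT` statement.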
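Only when the reporting system is done well can the value of ERP truly be realized. What users then see is not a pile of documents, but data that can support decisions.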
------云策数链