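Solution Overview
1.1 Core Goals of Data Enablement
text
Value chain from data warehouse to business enablement:
Phase 1: Data integration (completed)
├─ Unify data and eliminate silos
├─ Integrate POS, supply chain, and CRM data
└─ Lay the foundation for downstream analysis
Phase 2: In-depth analysis (focus of this document)
├─ Discover patterns, identify opportunities, forecast outcomes
├─ Four directions: store operations, marketing, supply chain, site selection
└─ Statistics, modeling, forecasting, causal analysis
Phase 3: Applied innovation (covered in a later document)
├─ Productization, automation, intelligent decisioning
├─ BI system, dashboards, decision engine
└─ Enable business teams to make decisions self-serve
End result: an agile, efficient, data-driven restaurant group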
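1.2 Four Business Enablement Directions
1️⃣ Fine-grained store operations management
- Sales forecasting (daily and weekly granularity)
- Labor efficiency optimization (shift scheduling, menu pairing)
- Customer traffic analysis (peak-hour alerts, customer mix)
- Menu item contribution (share of revenue and profit)
- 💰 Expected gain: labor efficiency ↑8-12%, average order value ↑5-8%
2️⃣ Precision marketing and member operations
- User profiling (purchase behavior, preference analysis)
- Targeted marketing (target segments, product recommendations)
- Member lifecycle (new, active, churned)
- Repurchase rate improvement (win-back, reactivation, upsell)
- 💰 Expected gain: repurchase rate ↑15-25%, customer value ↑30%
3️⃣ End-to-end supply chain optimization
- Demand forecasting (ingredients, inventory)
- Inventory optimization (minimum stock, turnover rate)
- Procurement optimization (quality, cost, delivery)
- Waste control (waste-rate monitoring)
- 💰 Expected gain: inventory ↓15-20%, waste ↓30%
4️⃣ Data-driven site selection and expansion
- Location assessment (trade-area analysis, competitive landscape)
- Site selection modeling (ramp-up period, payback)
- Risk identification (failure early warning, stop-loss mechanism)
- Decision support (open vs. close vs. convert a store)
- 💰 Expected gain: new-store success rate ↑20-30%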
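In-Depth Store Operations Analysis
2.1 Sales Forecasting Model
Forecasting model architecture:
text
Data inputs:
├─ Historical sales data (past 18-24 months)
├─ External features (weather, holidays, promotions, competition)
├─ Store features (location, size, type, maturity)
└─ Time features (seasonality, cycles, trend)
Model tiers:
├─ L1: Baseline models (moving average, exponential smoothing) - expected error ±10-15%
├─ L2: Statistical models (ARIMA, Holt-Winters) - expected error ±8-12%
└─ L3: Machine learning (XGBoost, LightGBM) - expected error ±5-8%
Forecast granularity:
├─ Daily: shift scheduling, inventory preparation
├─ Weekly: promotions, marketing plans
├─ Monthly: financial budgets, procurement plans
└─ Quarterly: strategic planning, regional benchmarking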
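To make the L1 baseline tier concrete, here is a minimal standard-library sketch of a moving-average forecast, simple exponential smoothing, and a MAPE check. The function names, the smoothing factor `alpha=0.3`, and the sales figures are illustrative assumptions, not values from the production pipeline.

```python
def moving_average_forecast(history, window=7):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def exponential_smoothing_forecast(history, alpha=0.3):
    """Simple exponential smoothing; returns the one-step-ahead forecast."""
    level = history[0]
    for value in history[1:]:
        # New level = weighted blend of the latest observation and the old level
        level = alpha * value + (1 - alpha) * level
    return level


def mape(actuals, forecasts):
    """Mean absolute percentage error (%), skipping zero actuals."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    return 100 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


# Hypothetical daily sales for one store (illustrative numbers only)
sales = [8000, 8200, 7900, 9500, 10200, 8100, 8300]
ma_forecast = moving_average_forecast(sales, window=7)
ses_forecast = exponential_smoothing_forecast(sales, alpha=0.3)
```

A baseline like this gives the L2 and L3 tiers a reference error to beat on the same holdout period.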
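Python sales forecasting code:
python
import numpy as np
import pandas as pd
import xgboost as xgb
from datetime import datetime, timedelta
from sklearn.preprocessing import StandardScaler


class SalesForecaster:
    # Same-day outcomes: unknown at forecast time, and daily_sales is roughly
    # order_count * avg_order_price, so using them as features leaks the target.
    LEAKY_COLS = ['customer_count', 'order_count', 'avg_order_price']

    def __init__(self, shop_id, lookback_days=540):
        self.shop_id = shop_id
        self.lookback_days = lookback_days
        self.model = None
        self.scaler = StandardScaler()

    def fetch_historical_data(self, db_connection):
        """Fetch historical data from Doris."""
        # Bound parameters instead of f-string interpolation. The %(name)s style
        # assumes a MySQL-protocol DB-API driver such as pymysql.
        query = """
            SELECT stat_date, shop_id, daily_sales AS label,
                   customer_count, order_count, avg_order_price,
                   max_temperature, rainfall, day_of_week,
                   is_holiday, discount_amount, is_promo_day
            FROM dws_daily_shop_analysis
            WHERE shop_id = %(shop_id)s
              AND stat_date >= DATE_SUB(CURDATE(), INTERVAL %(days)s DAY)
            ORDER BY stat_date
        """
        params = {'shop_id': self.shop_id, 'days': int(self.lookback_days)}
        df = pd.read_sql(query, db_connection, params=params)
        return self._engineer_features(df)

    def _engineer_features(self, df):
        """Feature engineering."""
        df = df.copy()
        df['stat_date'] = pd.to_datetime(df['stat_date'])
        df['month'] = df['stat_date'].dt.month
        df['week'] = df['stat_date'].dt.isocalendar().week.astype(int)
        # shift(1) so each rolling mean only sees sales before the current day
        past_sales = df['label'].shift(1)
        df['sales_ma7'] = past_sales.rolling(7, min_periods=1).mean()
        df['sales_ma30'] = past_sales.rolling(30, min_periods=1).mean()
        # Same-day-last-year proxy; assumes one row per calendar day. The first
        # 365 rows have no value and are dropped below, which leaves roughly
        # lookback_days - 365 rows for training.
        df['sales_yoy'] = df['label'].shift(365)
        return df.dropna()

    def train(self, df):
        """Train the XGBoost model on a chronological 80/20 split."""
        excluded = ['stat_date', 'shop_id', 'label'] + self.LEAKY_COLS
        feature_cols = [c for c in df.columns if c not in excluded]
        X, y = df[feature_cols], df['label']
        split = int(len(df) * 0.8)  # no shuffling: the test set is the latest 20%
        X_train, X_test = X.iloc[:split], X.iloc[split:]
        y_train, y_test = y.iloc[:split], y.iloc[split:]
        X_train_scaled = self.scaler.fit_transform(X_train)
        X_test_scaled = self.scaler.transform(X_test)
        # early_stopping_rounds is a constructor argument since xgboost 1.6;
        # passing it to fit() is deprecated and rejected by newer releases.
        self.model = xgb.XGBRegressor(n_estimators=200, max_depth=7,
                                      learning_rate=0.1, early_stopping_rounds=20)
        self.model.fit(X_train_scaled, y_train,
                       eval_set=[(X_test_scaled, y_test)], verbose=False)
        y_pred = self.model.predict(X_test_scaled)
        # The same set drives early stopping, so this MAPE is optimistic.
        y_true = y_test.to_numpy()
        nonzero = y_true != 0  # skip zero-sales days to avoid dividing by zero
        mape = np.mean(np.abs((y_true[nonzero] - y_pred[nonzero]) / y_true[nonzero])) * 100
        return {'mape': mape, 'model_ready': True}

    def predict(self, future_days=30):
        """Forecast sales for the next future_days days."""
        predictions = []
        for day in range(1, future_days + 1):
            pred_date = datetime.now() + timedelta(days=day)
            # Build features for pred_date and predict
            pred_value = self._get_forecast_value(pred_date)
            std_error = 1500  # placeholder; should be estimated from holdout residuals
            predictions.append({
                'date': pred_date.strftime('%Y-%m-%d'),
                'forecast': round(pred_value, 2),
                'lower_ci': round(max(0, pred_value - 1.96 * std_error), 2),
                'upper_ci': round(pred_value + 1.96 * std_error, 2)
            })
        return predictions

    def _get_forecast_value(self, pred_date):
        # Placeholder: the actual forecasting logic goes here
        return 8000  # example value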
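
The interval in `predict` relies on a fixed `std_error`. One possible way to derive it from data is to take the standard deviation of holdout residuals, as in the sketch below. It assumes residuals are roughly normal and unbiased; `residual_interval` and the sample numbers are hypothetical and not part of the class above.

```python
import statistics


def residual_interval(actuals, predictions, forecast, z=1.96):
    """Approximate prediction interval from holdout residuals.

    Assumes residuals are roughly normal with zero mean; z=1.96 then
    corresponds to about 95% coverage.
    """
    residuals = [a - p for a, p in zip(actuals, predictions)]
    std_error = statistics.stdev(residuals)  # sample standard deviation
    lower = max(0.0, forecast - z * std_error)  # sales cannot be negative
    upper = forecast + z * std_error
    return round(lower, 2), round(upper, 2), std_error


# Hypothetical holdout actuals and model predictions (illustrative only)
holdout_actuals = [8100, 7900, 9400, 10100, 8300]
holdout_preds = [8000, 8200, 9000, 9800, 8500]
lower_ci, upper_ci, std_error = residual_interval(holdout_actuals, holdout_preds, 8000)
```

If residuals grow with the forecast horizon, a separate `std_error` per horizon would be a more faithful choice than a single value.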
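2.2 Labor Efficiency and Shift Scheduling
Shift scheduling SQL:
sql
-- Staffing recommendations derived from the hourly order forecast
CREATE TEMPORARY TABLE hourly_forecast AS
SELECT shop_id, forecast_date, hour, forecasted_orders,
       CASE WHEN forecasted_orders < 20 THEN 2
            WHEN forecasted_orders < 50 THEN 3
            WHEN forecasted_orders < 100 THEN 4
            ELSE 5
       END AS recommended_staff
FROM dws_hourly_forecast
WHERE shop_id = 'SHOP001' AND forecast_date = CURDATE();
-- Compare with the current schedule
-- (assumes current_shifts has a shop_id column and a datetime shift_date column)
SELECT h.hour, c.actual_staff, h.recommended_staff,
       CASE WHEN c.actual_staff IS NULL THEN 'No schedule data'
            WHEN c.actual_staff > h.recommended_staff + 1 THEN 'Can reduce staff'
            WHEN c.actual_staff < h.recommended_staff - 1 THEN 'Needs more staff'
            ELSE 'OK'
       END AS suggestion
FROM hourly_forecast h
LEFT JOIN current_shifts c
       ON c.shop_id = h.shop_id
      AND DATE(c.shift_date) = h.forecast_date
      AND HOUR(c.shift_date) = h.hour
ORDER BY h.hour;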
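
The staffing rule in the query can also be unit-tested outside the database. The sketch below mirrors the same thresholds (20, 50, 100 orders) and the one-person tolerance band; the function names are hypothetical.

```python
def recommended_staff(forecasted_orders):
    """Map an hourly order forecast to a headcount, using the query's thresholds."""
    if forecasted_orders < 20:
        return 2
    if forecasted_orders < 50:
        return 3
    if forecasted_orders < 100:
        return 4
    return 5


def staffing_suggestion(actual_staff, forecasted_orders, tolerance=1):
    """Flag an hour only when staffing deviates by more than `tolerance` people."""
    if actual_staff is None:
        return 'No schedule data'
    target = recommended_staff(forecasted_orders)
    if actual_staff > target + tolerance:
        return 'Can reduce staff'
    if actual_staff < target - tolerance:
        return 'Needs more staff'
    return 'OK'
```

For example, `staffing_suggestion(5, 10)` flags an over-staffed quiet hour, while a one-person gap stays inside the tolerance band.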
2.3 Menu Item Contribution Analysis
sql
-- Sales, profit, and kitchen-efficiency contribution per menu item
SELECT
    menu_id, menu_name,
    COUNT(DISTINCT order_id) AS sales_count,
    SUM(actual_price) AS total_revenue,
    SUM(actual_price) - SUM(cost) AS total_profit,
    -- SUM(SUM(x)) OVER (): the window runs over the grouped rows, giving the grand total
    ROUND(SUM(actual_price) / SUM(SUM(actual_price)) OVER () * 100, 2) AS revenue_pct,
    ROUND((SUM(actual_price) - SUM(cost)) / NULLIF(SUM(actual_price), 0) * 100, 2) AS profit_margin,
    ROUND(AVG(cook_time), 1) AS avg_cook_time,
    CASE WHEN SUM(actual_price) / SUM(SUM(actual_price)) OVER () >= 0.05 THEN 'Core item'
         WHEN ROUND((SUM(actual_price) - SUM(cost)) / NULLIF(SUM(actual_price), 0) * 100, 2) >= 50 THEN 'High-margin item'
         ELSE 'Regular item'
    END AS menu_category
FROM ods_order_items
WHERE shop_id = 'SHOP001' AND order_time >= DATE_SUB(CURDATE(), INTERVAL 30 DAY)
GROUP BY menu_id, menu_name
ORDER BY total_profit DESC;
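
The classification order matters: an item is first tested for revenue share, and only then for margin. A small sketch of that precedence on made-up data follows; `classify_menu` and the sample menu are illustrative assumptions.

```python
def classify_menu(items, core_share=0.05, high_margin=0.5):
    """Classify items as in the query: revenue share is checked before margin."""
    total_revenue = sum(item['revenue'] for item in items)
    result = {}
    for item in items:
        revenue = item['revenue']
        share = revenue / total_revenue if total_revenue else 0.0
        margin = (revenue - item['cost']) / revenue if revenue else 0.0
        if share >= core_share:
            category = 'Core item'
        elif margin >= high_margin:
            category = 'High-margin item'
        else:
            category = 'Regular item'
        result[item['name']] = category
    return result


# Hypothetical 30-day totals (illustrative only)
menu = [
    {'name': 'Braised beef', 'revenue': 9300, 'cost': 5400},
    {'name': 'Iced tea', 'revenue': 400, 'cost': 100},
    {'name': 'Side salad', 'revenue': 300, 'cost': 200},
]
categories = classify_menu(menu)
```

Because share is checked first, a high-margin item that also exceeds 5% of revenue is reported as a core item, not as a high-margin item.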
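Marketing Data Enablement
3.1 User Profile System
sql
-- Build the full user profile
CREATE TABLE dws_user_profile AS
WITH member_stats AS (
    SELECT
        m.member_id, m.phone, m.age_group, m.gender,
        -- Purchase behavior
        COUNT(DISTINCT t.order_id) AS total_orders,
        COALESCE(SUM(t.order_total), 0) AS total_spending,
        AVG(t.order_total) AS avg_order_value,
        MAX(t.order_date) AS last_order_date,
        DATEDIFF(CURDATE(), MAX(t.order_date)) AS days_since_last_order
    FROM dim_member m
    -- The date filter sits in ON; in WHERE it would turn the LEFT JOIN into an
    -- inner join and drop members with no orders in the last 365 days
    LEFT JOIN ods_transactions t
           ON m.member_id = t.member_id
          AND t.order_date >= DATE_SUB(CURDATE(), INTERVAL 365 DAY)
    GROUP BY m.member_id, m.phone, m.age_group, m.gender
),
category_rank AS (
    -- Rank each member's menu categories by number of items ordered
    SELECT member_id, menu_category,
           ROW_NUMBER() OVER (PARTITION BY member_id ORDER BY COUNT(*) DESC) AS rn
    FROM ods_order_items
    GROUP BY member_id, menu_category
),
ranked AS (
    -- Spending percentile across members; replaces the PERCENTILE_CONT
    -- subqueries, which had no FROM clause and could not run
    SELECT s.*, PERCENT_RANK() OVER (ORDER BY total_spending) AS spend_pct
    FROM member_stats s
)
SELECT
    r.member_id, r.phone, r.age_group, r.gender,
    r.total_orders, r.total_spending, r.avg_order_value,
    r.last_order_date, r.days_since_last_order,
    -- Menu preference
    c.menu_category AS favorite_category,
    -- Lifecycle (members with no orders in the window fall through to 'Churned')
    CASE WHEN r.days_since_last_order <= 7 THEN 'Highly active'
         WHEN r.days_since_last_order <= 30 THEN 'Active'
         WHEN r.days_since_last_order <= 90 THEN 'Occasional'
         WHEN r.days_since_last_order <= 180 THEN 'Dormant'
         ELSE 'Churned'
    END AS lifecycle_stage,
    -- Value tier
    CASE WHEN r.spend_pct >= 0.8 AND r.days_since_last_order <= 30 THEN 'VIP'
         WHEN r.spend_pct >= 0.5 AND r.days_since_last_order <= 60 THEN 'High value'
         ELSE 'Regular'
    END AS value_segment
FROM ranked r
LEFT JOIN category_rank c ON c.member_id = r.member_id AND c.rn = 1;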
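
The value tiers compare each member's spending against the 80th and 50th percentiles of all members, combined with a recency cutoff. One way to express that is a percent rank, sketched below in plain Python. The function names and sample spend values are hypothetical.

```python
def percent_rank(values):
    """Percent rank per value: (rank - 1) / (n - 1); ties share the lowest rank."""
    n = len(values)
    if n < 2:
        return [0.0] * n
    ordered = sorted(values)
    # ordered.index(v) counts the values strictly below v, i.e. rank - 1
    return [ordered.index(v) / (n - 1) for v in values]


def value_segment(spend_pct, days_since_last_order):
    """Combine spending percentile with recency, as in the profile table."""
    if days_since_last_order is None:
        return 'Regular'  # no orders in the window
    if spend_pct >= 0.8 and days_since_last_order <= 30:
        return 'VIP'
    if spend_pct >= 0.5 and days_since_last_order <= 60:
        return 'High value'
    return 'Regular'


# Hypothetical 12-month spend for five members (illustrative only)
spend = [100, 300, 200, 500, 400]
ranks = percent_rank(spend)
```

A top spender who has been away for more than 30 days drops to the second tier, so recency acts as a gate on every tier.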
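3.2 Improving Member Repurchase Rate
python
# Member lifecycle management and repurchase optimization.
# The _get_* and _generate_coupon helpers are not shown here and must be
# implemented against the member data store.
class MemberLifecycleManager:
    def calculate_churn_probability(self, member_id):
        """Estimate churn risk from RFM scores (each assumed to be on a 1-5 scale)."""
        # R (Recency): time since the last purchase
        # F (Frequency): purchase frequency
        # M (Monetary): purchase amount
        r_score = self._get_recency_score(member_id)
        f_score = self._get_frequency_score(member_id)
        m_score = self._get_monetary_score(member_id)
        rfm_score = (r_score + f_score + m_score) / 3
        # Heuristic, not a calibrated probability: a higher score means lower
        # churn risk. With 1-5 scores the result ranges from 0.0 to 0.8.
        churn_prob = 1 - (rfm_score / 5)
        if churn_prob > 0.7:
            risk_level = 'P0_churn_imminent'
        elif churn_prob > 0.5:
            risk_level = 'P1_high_risk'
        else:
            risk_level = 'low_risk'
        return {
            'rfm_score': round(rfm_score, 2),
            'churn_probability': round(churn_prob, 3),
            'risk_level': risk_level
        }

    def recommend_marketing_action(self, member_id):
        """Recommend personalized marketing actions."""
        profile = self._get_member_profile(member_id)
        recommendations = []
        if profile['days_since_last_order'] > 7:
            recommendations.append({
                'action': 'push_notification',
                'content': f"Long time no see! Limited-time offer on {profile['favorite_category']}",
                'discount_code': self._generate_coupon(member_id),
                'timing': 'lunch_time'
            })
        if profile['avg_order_value'] < 200:
            recommendations.append({
                'action': 'premium_recommendation',
                'message': 'You might like these new dishes...',
                'expected_uplift': '15-20%'
            })
        return recommendations

    def calculate_ltv(self, member_id):
        """Compute customer lifetime value (LTV)."""
        profile = self._get_member_profile(member_id)
        avg_order = profile['avg_order_value']
        # 'order_frequency' is read here as the average number of days between
        # orders, so 365 / value gives orders per year.
        annual_freq = 365 / max(profile['order_frequency'], 1)
        # Expected remaining lifetime in years per stage. These stage names
        # differ from lifecycle_stage in dws_user_profile; unknown stages use 1.
        lifecycle_duration = {
            'New': 1, 'Growing': 2, 'Active': 3,
            'Occasional': 1.5, 'Dormant': 0.5, 'Churned': 0
        }
        ltv = avg_order * annual_freq * lifecycle_duration.get(profile['stage'], 1)
        return round(ltv, 2)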
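
The R, F, and M score helpers are not defined in the class above. A common approach, shown here as an assumption and not as the author's implementation, is rank-based quintile scoring across the member base. `quintile_scores`, `churn_heuristic`, and the sample values are hypothetical.

```python
def quintile_scores(values, higher_is_better=True):
    """Map each value to a 1-5 score by its rank-based quintile.

    Ties are broken by input order, so equal values can land in
    neighboring scores.
    """
    n = len(values)
    # Worst value first: ascending when higher is better, descending otherwise
    order = sorted(range(n), key=lambda i: values[i], reverse=not higher_is_better)
    scores = [0] * n
    for rank, i in enumerate(order):
        scores[i] = min(5, rank * 5 // n + 1)
    return scores


def churn_heuristic(r_score, f_score, m_score):
    """Same formula as calculate_churn_probability: 1 - mean(R, F, M) / 5."""
    return 1 - (r_score + f_score + m_score) / 3 / 5


# Hypothetical members: days since last order, orders per year, yearly spend
recency_days = [3, 200, 30, 90, 7]
order_counts = [40, 2, 12, 5, 25]
r_scores = quintile_scores(recency_days, higher_is_better=False)  # fewer days is better
f_scores = quintile_scores(order_counts)
```

Scoring by rank keeps the three dimensions on the same 1-5 scale regardless of their units, which is what the averaging step in the class assumes.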
Supply Chain Optimization Analysis
4.1 Ingredient Demand Forecasting
python
import numpy as np


class IngredientDemandForecaster:
    """Ingredient demand forecasting system."""

    def forecast_demand(self, ingredient_id, forecast_days=14):
        """Forecast the daily required quantity of one ingredient."""
        # Menu item sales forecast
        menu_forecast = self._get_menu_forecast(forecast_days)
        # Menu item -> ingredient usage mapping
        recipe_mapping = self._get_recipe_mapping()
        # Compute ingredient demand
        demand_forecast = {}
        for day in range(forecast_days):
            daily_demand = 0
            for menu_item, qty in menu_forecast[day].items():
                # .get() tolerates menu items that have no recipe entry
                usage = recipe_mapping.get(menu_item, {}).get(ingredient_id, 0)
                daily_demand += qty * usage
            # Add a 10% safety buffer
            demand_forecast[day] = {
                'base': daily_demand,
                'with_safety': daily_demand * 1.1,
                'reorder_point': daily_demand * 1.1 * 2  # 2-day lead time
            }
        return demand_forecast

    def optimize_inventory(self, ingredient_id, current_stock):
        """Inventory optimization recommendation."""
        forecast = self.forecast_demand(ingredient_id, 30)
        total_30day = sum(d['with_safety'] for d in forecast.values())
        avg_daily = total_30day / 30
        # Economic order quantity (EOQ): order_cost is per order; holding_cost is
        # assumed to be per unit per year, since demand is annualized, and must be > 0
        params = self._get_ingredient_params(ingredient_id)
        eoq = np.sqrt((2 * avg_daily * 365 * params['order_cost']) / params['holding_cost'])
        reorder_point = avg_daily * params['lead_time_days']
        max_stock = reorder_point + eoq
        return {
            'reorder_point': round(reorder_point, 2),
            'order_qty': round(eoq, 2),
            'max_level': round(max_stock, 2),
            'action': self._get_action(current_stock, reorder_point, max_stock)
        }

    def _get_action(self, current, reorder, max_level):
        if current <= reorder:
            return 'Urgent restock'
        elif current >= max_level:
            return 'Overstocked'
        else:
            return 'OK'
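
As a worked instance of the EOQ step, the sketch below applies the classic formula EOQ = sqrt(2 * D * S / H), where D is annual demand, S is the cost per order, and H is the holding cost per unit per year. The parameter values are made up for illustration, and the function names are hypothetical.

```python
import math


def eoq(annual_demand, order_cost, holding_cost):
    """Economic order quantity: sqrt(2 * D * S / H)."""
    if holding_cost <= 0:
        raise ValueError('holding_cost must be positive')
    return math.sqrt(2 * annual_demand * order_cost / holding_cost)


def inventory_plan(avg_daily_demand, lead_time_days, order_cost, holding_cost):
    """Reorder point, order quantity, and max stock level, as in optimize_inventory."""
    order_qty = eoq(avg_daily_demand * 365, order_cost, holding_cost)
    reorder_point = avg_daily_demand * lead_time_days
    return {
        'reorder_point': round(reorder_point, 2),
        'order_qty': round(order_qty, 2),
        'max_level': round(reorder_point + order_qty, 2),
    }


# Hypothetical ingredient: 20 kg/day, 2-day lead time, 50 per order, 2 per kg per year
plan = inventory_plan(avg_daily_demand=20, lead_time_days=2, order_cost=50, holding_cost=2)
```

For perishable ingredients the EOQ may exceed what can be used within shelf life, so in practice the order quantity would likely be capped by shelf-life demand.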
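4.2 Inventory and Waste Monitoring
sql
-- Detect inventory anomalies and waste risk
SELECT
    ingredient_id, ingredient_name, current_stock,
    inventory_status, waste_rate,
    CASE WHEN waste_rate > 15 THEN 'P0_high_risk'
         WHEN waste_rate > 8 THEN 'P1_medium_risk'
         ELSE 'P2_low_risk'
    END AS waste_risk
FROM (
    -- waste_rate is computed in a subquery because a column alias cannot be
    -- referenced elsewhere in the same SELECT list
    SELECT
        ingredient_id, ingredient_name, current_stock,
        CASE WHEN current_stock <= reorder_point THEN 'Stockout risk'
             WHEN current_stock >= max_stock * 0.9 THEN 'Overstock risk'
             WHEN shelf_life_days - days_in_stock < 5 THEN 'Near-expiry risk'
             ELSE 'OK'
        END AS inventory_status,
        -- Waste rate (%); NULLIF avoids division by zero when nothing was purchased
        ROUND(waste_amount / NULLIF(total_purchase, 0) * 100, 2) AS waste_rate
    FROM dws_ingredient_inventory
    WHERE stat_date = CURDATE()
) inv
ORDER BY waste_risk;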
New Store Site Selection Modeling
5.1 Site Evaluation Model
sql
-- Trade-area evaluation
SELECT
    site_id, site_address, city, district,
    resident_population_within_1km,
    working_population_within_1km,
    competitor_count_within_1km,
    public_transport_density,
    -- Composite score
    (population_score + competition_score + traffic_score) / 3 AS overall_score,
    -- Only the first matching risk is reported
    CASE WHEN competitor_count_within_1km > 3 THEN 'High competition'
         WHEN resident_population_within_1km < 20000 THEN 'Low population'
         WHEN public_transport_density < 2 THEN 'Poor transit access'
         ELSE 'Normal'
    END AS risk_level
FROM dws_site_evaluation
ORDER BY overall_score DESC;
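
Because the CASE expression stops at the first match, a site with several problems shows only one of them. A possible alternative, sketched here with the same thresholds, collects every applicable flag. The function names are hypothetical.

```python
def overall_score(population_score, competition_score, traffic_score):
    """Unweighted mean of the three component scores, as in the query."""
    return (population_score + competition_score + traffic_score) / 3


def site_risks(competitor_count, resident_population, transit_density):
    """Return every risk flag that applies, using the query's thresholds."""
    risks = []
    if competitor_count > 3:
        risks.append('High competition')
    if resident_population < 20000:
        risks.append('Low population')
    if transit_density < 2:
        risks.append('Poor transit access')
    return risks or ['Normal']
```

For example, `site_risks(5, 15000, 3)` reports both the competition and the population issue, where the SQL would report only the first.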
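5.2 Success Probability Prediction
python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier


class NewStoreSuccessPredictor:
    """New-store success probability prediction."""
    GROSS_MARGIN = 0.35  # gross margin assumed by the financial estimate

    def __init__(self):
        self.model = None
        self.feature_cols = None

    def train(self, db_connection):
        """Train on historical store-opening data."""
        query = """
            SELECT location_rent, location_population_1km,
                   competitor_count, store_size, store_type,
                   CASE WHEN cumulative_roi >= 1.2 THEN 1 ELSE 0 END AS success
            FROM historical_store_performance
            WHERE opening_date >= DATE_SUB(CURDATE(), INTERVAL 5 YEAR)
        """
        df = pd.read_sql(query, db_connection)
        # One-hot encode store_type (assumed categorical); the forest needs numeric input
        X = pd.get_dummies(df.drop('success', axis=1), columns=['store_type'])
        y = df['success']
        self.feature_cols = list(X.columns)
        self.model = RandomForestClassifier(n_estimators=200, max_depth=15, random_state=42)
        self.model.fit(X, y)
        # success_rate is the base rate in the training data, not a model metric
        return {'training_samples': len(df), 'success_rate': y.mean()}

    def predict(self, site_features):
        """Predict the probability of success."""
        X = pd.get_dummies(pd.DataFrame([site_features]))
        # Align with the training columns; dummy columns not seen here become 0
        X = X.reindex(columns=self.feature_cols, fill_value=0)
        prob = float(self.model.predict_proba(X)[0][1])
        return {
            'success_probability': round(prob, 3),
            'rating': 'Excellent⭐⭐⭐⭐⭐' if prob >= 0.8 else 'Good⭐⭐⭐⭐' if prob >= 0.6 else 'Risky⭐⭐',
            'recommendation': '🟢Open now' if prob >= 0.8 else '🟡Needs review' if prob >= 0.6 else '🔴Not recommended'
        }

    def estimate_financials(self, site_features):
        """Estimate financial metrics."""
        city = site_features['city']
        benchmarks = self._get_benchmarks(city)
        adjusted = self._adjust_for_location(site_features, benchmarks)
        monthly_profit = adjusted['sales'] * self.GROSS_MARGIN - adjusted['cost']
        # A payback period is normally upfront investment / monthly profit;
        # 'rent' is kept from the original and assumed to hold the amount to recover.
        # No payback exists when the store does not turn a profit.
        payback = site_features['rent'] / monthly_profit if monthly_profit > 0 else None
        return {
            'monthly_sales': adjusted['sales'],
            'monthly_profit': monthly_profit,
            'payback_months': payback
        }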
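
To illustrate the payback arithmetic on its own, here is a small sketch that uses the same 35% gross margin. It treats the amount to recover as an upfront investment, which is an assumption; `payback_months` as a standalone function and the sample figures are hypothetical.

```python
def payback_months(upfront_investment, monthly_sales, monthly_cost, gross_margin=0.35):
    """Months needed to recover the upfront investment from monthly profit.

    Returns None when monthly profit is zero or negative, since the
    investment is then never recovered.
    """
    monthly_profit = monthly_sales * gross_margin - monthly_cost
    if monthly_profit <= 0:
        return None
    return upfront_investment / monthly_profit


# Hypothetical site: 600,000 upfront, 300,000 monthly sales, 55,000 monthly fixed cost
months = payback_months(600_000, 300_000, 55_000)
```

With these inputs the monthly profit is about 50,000, so the investment is recovered in roughly 12 months.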
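Data Culture and Self-Service Analytics
6.1 Self-Service Analytics Enablement Framework
text
┌─ Tool democratization ─┬─ BI tools (Tableau/Metabase)
│                        │   └─ 50+ prebuilt templates, drag-and-drop analysis
│                        │
│                        ├─ SQL client (DBeaver)
│                        │   └─ SQL snippet library, query templates
│                        │
│                        └─ Analysis environment (Jupyter)
│                            └─ Python/R, prebuilt frameworks
┌─ Skill tiers ──────────┬─ L0: Read reports (1 hour)
│                        ├─ L1: Self-service BI analysis (4 hours)
│                        ├─ L2: SQL analysis (3 days)
│                        └─ L3: Modeling and statistics (5 days)
┌─ Access governance ────┬─ Data classification: public / department / sensitive
│                        ├─ Role permissions: view / create / manage
│                        └─ Audit log: who queried what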
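6.2 Typical Analysis Case Templates
Scenario 1: Daily store diagnosis (for store managers)
- Question: "Why did sales drop today?"
- Analysis: hourly distribution, menu item ranking, customer traffic, external factors
- Data: daily report template (prebuilt SQL)
- Outcome: identify the root cause → take action
Scenario 2: Marketing campaign evaluation (for marketing managers)
- Question: "What was the ROI of this campaign?"
- Analysis: participants, conversion rate, incremental sales, return on investment
- Data: campaign performance template (by menu item, region, and member dimensions)
- Outcome: evaluate the results → improve the next campaign
Scenario 3: Inventory alerts (for procurement managers)
- Question: "Which ingredients have abnormal stock levels?"
- Analysis: automatic benchmarking across all stores and all ingredients
- Data: inventory alert table (updated in real time)
- Outcome: adjust the procurement plan in time
Next step: see document 08, Data Products and Applied Innovation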
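
For Scenario 2, the core metrics can be sketched as follows. The definitions are one common convention and an assumption here: incremental sales are measured against a baseline period, and ROI is incremental gross profit net of campaign cost, divided by campaign cost. All names and figures are hypothetical.

```python
def campaign_metrics(reached, converted, campaign_sales, baseline_sales,
                     campaign_cost, gross_margin):
    """Conversion rate, incremental sales, and ROI for one campaign."""
    incremental_sales = campaign_sales - baseline_sales
    incremental_profit = incremental_sales * gross_margin
    return {
        'conversion_rate': converted / reached if reached else 0.0,
        'incremental_sales': incremental_sales,
        # ROI = (incremental gross profit - cost) / cost
        'roi': (incremental_profit - campaign_cost) / campaign_cost,
    }


# Hypothetical campaign (illustrative only)
result = campaign_metrics(reached=5000, converted=400,
                          campaign_sales=150_000, baseline_sales=100_000,
                          campaign_cost=10_000, gross_margin=0.4)
```

The baseline matters more than the formula: comparing against a matched control group of members gives a more trustworthy uplift than comparing against the previous week.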