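As a major e-commerce platform in China's lower-tier markets, Weidian (微店) spreads its product detail data across the interfaces of three clients: the mini program, the app, and H5. These interfaces sit behind a three-layer risk-control system: token expiry checks, device-fingerprint binding, and request rate limiting. Approaches that scrape only the H5 interface tend to return incomplete data (no SKU-level stock, no promotion details) and to get the interface blocked. This article proposes a scheme built on per-client interface adaptation, dynamic token refresh, and end-to-end data completion. It collects the full set of Weidian product data (basic information, specifications and stock, promotions, and merchant services) while limiting risk-control exposure and supporting commercial analysis.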
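I. Core Interface Mechanics and Differences Across Clients
Retrieving Weidian product details requires a chained call to three interfaces: basic product information, specifications and stock, and promotions. The three clients differ markedly in signing rules, data coverage, and risk-control strictness, as summarized below:
1. Core parameters and data differences across the three clients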
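| Client | Core interface | Signature / validation parameters | Data coverage | Risk-control characteristics |
|---|---|---|---|---|
| H5 | /api/v1/item/detail | appKey, timestamp, sign (MD5), deviceId | Basic title, price, and main image; no SKU-level stock | Lenient; no login state required; 100 requests per IP per day |
| Mini program | /api/wxapp/item/detail_v2 | token, openId, wxAppId, signature | Full specifications, stock, promotions, and merchant rating | WeChat login state required; token expires after 2 hours; 50 requests per hour |
| App | /api/mobile/item/detail_full | accessToken, deviceFinger, appVersion, sign (HMAC-SHA1) | Full data set (including price history, sales trend, and after-sales policy) | Strictest; bound to the device fingerprint; high request rates trigger account bans |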
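2. Key challenges
- Per-client interface adaptation: the three clients differ widely in signing rules and response fields, so a single-client approach cannot return the full data set. Parameters and parsing logic must be adapted for each client.
- Dynamic token refresh: mini program and app tokens expire, and reusing a stale token tends to return 401. Tokens need to be refreshed automatically from the login state and cached.
- Completing specification and stock data: the H5 interface omits SKU-level stock, so it has to come from the mini program or app interface, and each SKU must then be mapped back to its parent product.
- Device fingerprint and risk control: the app's device fingerprint (deviceFinger) is bound to hardware information, so the fingerprint has to be generated from realistic device fields to avoid triggering risk control.
- Mining commercial value: data from all three clients is combined to produce decision-level outputs such as a competitiveness score, a promotion value analysis, and a purchase risk assessment.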
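The SKU-to-product mapping mentioned in the list above can be sketched as a small data model. This is a minimal illustration, not the article's implementation: the `Sku` and `Product` classes and the `attach_skus` helper are hypothetical names, and the raw field names (`skuId`, `specDesc`, `price`, `stock`) are the ones this article assumes the mini program response uses.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Sku:
    sku_id: str
    spec_desc: str
    price: float
    stock: int


@dataclass
class Product:
    item_id: str
    title: str = ""
    # SKUs keyed by SKU ID so each variant maps back to exactly one parent
    skus: Dict[str, Sku] = field(default_factory=dict)

    def total_stock(self) -> int:
        """Sum stock over all SKUs attached to this product."""
        return sum(sku.stock for sku in self.skus.values())


def attach_skus(product: Product, raw_skus: List[dict]) -> Product:
    """Attach raw SKU records to their parent product, keyed by SKU ID."""
    for raw in raw_skus:
        sku_id = str(raw.get("skuId", ""))
        if not sku_id:
            continue  # a record without an ID cannot be mapped; skip it
        product.skus[sku_id] = Sku(
            sku_id=sku_id,
            spec_desc=str(raw.get("specDesc", "")),
            price=float(raw.get("price", 0) or 0),
            stock=int(raw.get("stock", 0) or 0),
        )
    return product


if __name__ == "__main__":
    product = attach_skus(
        Product(item_id="123456789", title="demo item"),
        [
            {"skuId": "a", "specDesc": "red / M", "price": "19.9", "stock": 5},
            {"skuId": "b", "specDesc": "red / L", "price": "21.9", "stock": 7},
        ],
    )
    print(sorted(product.skus), product.total_stock())
```

Keying SKUs by ID inside the parent object keeps the association explicit, so H5 basic information and mini program SKU data can be merged on `item_id` without duplicating variants.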
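II. Implementation of the Technical Scheme
The scheme consists of four components: a multi-client signature generator, a dynamic token manager, a multi-source product detail scraper, and a product value reconstructor. Together they cover the full path from interface adaptation to value analysis.
1. Multi-client signature generator (core component)
This component adapts to the signing rules of the H5, mini program, and app clients and produces the signature parameters each client expects, so that requests pass validation: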
```python
import hashlib
import hmac
import random
import time
import uuid
from typing import Dict, Optional


class WeidianMultiSignGenerator:
    def __init__(self,
                 h5_appkey: str = "wx22d20f9767151146",   # default Weidian H5 appKey
                 app_secret: str = "",                    # HMAC secret for the app client
                 wx_appid: str = "wx22d20f9767151146"):   # mini program appId
        self.h5_appkey = h5_appkey
        self.app_secret = app_secret
        self.wx_appid = wx_appid
        self.app_version = {"h5": "1.0.0", "wxapp": "2.8.3", "app": "6.9.2"}

    def generate_device_id(self) -> str:
        """Generate a device ID for the H5/app clients (UUID without hyphens)."""
        return uuid.uuid4().hex

    def generate_device_finger(self) -> str:
        """Generate an app device fingerprint (simulated hardware fields, joined and hashed)."""
        # Simulated device info: brand + model + OS version + IMEI
        brand = random.choice(["Xiaomi", "Huawei", "OPPO", "vivo", "Apple"])
        model = random.choice(["Mi 13", "Mate 50", "Reno 9", "X90", "iPhone 14"])
        system_version = random.choice(["Android 13", "iOS 16.5"])
        imei = "".join(random.choices("0123456789", k=15))
        # Join the fields and hash them with MD5 to form the fingerprint
        raw_finger = f"{brand}|{model}|{system_version}|{imei}"
        return hashlib.md5(raw_finger.encode()).hexdigest()

    @staticmethod
    def _join_sorted(params: Dict) -> str:
        """Concatenate key/value pairs in key order: key1value1key2value2..."""
        return "".join(f"{k}{v}" for k, v in sorted(params.items()))

    def generate_h5_sign(self, params: Dict) -> str:
        """H5 MD5 signature over the sorted parameters (adds appKey, timestamp, version to params)."""
        params.update({
            "appKey": self.h5_appkey,
            "timestamp": str(int(time.time())),
            "version": self.app_version["h5"],
        })
        # Lowercase hex MD5 digest
        return hashlib.md5(self._join_sorted(params).encode()).hexdigest()

    def generate_wxapp_sign(self, params: Dict, session_key: str) -> str:
        """Mini program signature: HMAC-SHA1 over the sorted parameters, keyed by session_key."""
        params.update({
            "wxAppId": self.wx_appid,
            "version": self.app_version["wxapp"],
            "timestamp": str(int(time.time())),
        })
        return hmac.new(session_key.encode(), self._join_sorted(params).encode(),
                        hashlib.sha1).hexdigest()

    def generate_app_sign(self, params: Dict) -> str:
        """App signature: HMAC-SHA1 over the sorted parameters plus app_secret."""
        if not self.app_secret:
            raise ValueError("app_secret must be configured to sign app requests")
        params.update({
            "appVersion": self.app_version["app"],
            "timestamp": str(int(time.time())),
            "deviceFinger": self.generate_device_finger(),
        })
        sign_str = self._join_sorted(params) + self.app_secret
        return hmac.new(self.app_secret.encode(), sign_str.encode(), hashlib.sha1).hexdigest()

    def build_params(self, item_id: str, platform: str,
                     extra_params: Optional[Dict] = None) -> Dict:
        """Build the complete request parameters, including the signature, for one client."""
        extra = extra_params or {}  # avoids AttributeError when no extras are passed
        base_params = {"itemId": item_id, **extra}
        if platform == "h5":
            base_params["deviceId"] = self.generate_device_id()
            base_params["sign"] = self.generate_h5_sign(base_params)
        elif platform == "wxapp":
            session_key = extra.get("sessionKey", "")
            base_params["sessionKey"] = session_key
            base_params["openId"] = extra.get("openId", "")
            base_params["signature"] = self.generate_wxapp_sign(base_params, session_key)
        elif platform == "app":
            base_params["accessToken"] = extra.get("accessToken", "")
            base_params["sign"] = self.generate_app_sign(base_params)
        else:
            raise ValueError("Unsupported platform; choose one of: h5, wxapp, app")
        return base_params
```
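The signing step can be checked in isolation. The sketch below assumes the same canonical form this section describes (key/value pairs concatenated in key order, then MD5 or HMAC-SHA1); `canonical_string`, `md5_sign`, and `hmac_sha1_sign` are illustrative helper names, and the key and parameter values are made up.

```python
import hashlib
import hmac
from typing import Dict


def canonical_string(params: Dict[str, str]) -> str:
    """Concatenate key/value pairs in ascending key order (key1value1key2value2...)."""
    return "".join(f"{k}{v}" for k, v in sorted(params.items()))


def md5_sign(params: Dict[str, str]) -> str:
    """Lowercase hex MD5 digest of the canonical string."""
    return hashlib.md5(canonical_string(params).encode("utf-8")).hexdigest()


def hmac_sha1_sign(params: Dict[str, str], key: str) -> str:
    """Hex HMAC-SHA1 digest of the canonical string under the given key."""
    return hmac.new(key.encode("utf-8"),
                    canonical_string(params).encode("utf-8"),
                    hashlib.sha1).hexdigest()


if __name__ == "__main__":
    # Made-up values, for illustration only
    a = {"itemId": "123", "timestamp": "1700000000", "appKey": "demo"}
    b = {"timestamp": "1700000000", "appKey": "demo", "itemId": "123"}
    print(canonical_string(a))
    # Insertion order does not matter because keys are sorted first
    print(md5_sign(a) == md5_sign(b))
    print(hmac_sha1_sign(a, "demo-key") == hmac_sha1_sign(b, "demo-key"))
```

Sorting before concatenation is what makes the signature reproducible on the server side: both ends derive the same string regardless of the order in which parameters were added.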
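2. Dynamic token manager
This component obtains mini program and app tokens, tracks their expiry, and refreshes them when needed, so that collection is not interrupted by an expired token: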
```python
import hashlib
import time
from typing import Dict, Optional

import requests
from fake_useragent import UserAgent


class WeidianTokenManager:
    def __init__(self, wx_appid: str = "wx22d20f9767151146", proxy: Optional[str] = None):
        self.wx_appid = wx_appid
        self.proxy = proxy
        self.session = self._init_session()
        # Token cache: key is the platform, value holds the token fields and expireTime
        self.token_cache = {}
        # Endpoints
        self.wxapp_token_url = "https://api.weidian.com/api/wxapp/user/login"
        self.app_token_url = "https://api.weidian.com/api/mobile/user/token"

    def _init_session(self) -> requests.Session:
        """Initialize the HTTP session."""
        session = requests.Session()
        session.headers.update({
            "User-Agent": UserAgent().random,
            "Accept": "application/json, text/plain, */*",
            "Content-Type": "application/x-www-form-urlencoded;charset=UTF-8",
        })
        if self.proxy:
            session.proxies = {"http": self.proxy, "https": self.proxy}
        return session

    def _cache_wxapp_token(self, data: Dict) -> Dict:
        """Cache the mini program session with a 2-hour expiry (the default token lifetime)."""
        self.token_cache["wxapp"] = {
            "sessionKey": data["sessionKey"],
            "openId": data["openId"],
            "expireTime": time.time() + 7200,
        }
        return self.token_cache["wxapp"]

    def get_wxapp_token(self, code: str) -> Dict:
        """Exchange a mini program code for sessionKey and openId (simulated user login)."""
        # In a real flow the code comes from WeChat authorization; simplified here
        params = {"appid": self.wx_appid, "code": code, "grant_type": "authorization_code"}
        response = self.session.get(self.wxapp_token_url, params=params, timeout=15)
        result = response.json()
        if result.get("code") != 0:
            raise RuntimeError(f"Failed to get mini program token: {result.get('msg')}")
        return self._cache_wxapp_token(result["data"])

    def refresh_wxapp_token(self, refresh_token: str) -> Dict:
        """Refresh the mini program token without logging in again."""
        params = {"appid": self.wx_appid, "refresh_token": refresh_token,
                  "grant_type": "refresh_token"}
        response = self.session.get(self.wxapp_token_url, params=params, timeout=15)
        result = response.json()
        if result.get("code") != 0:
            raise RuntimeError(f"Failed to refresh mini program token: {result.get('msg')}")
        return self._cache_wxapp_token(result["data"])

    def get_app_token(self, username: str, password: str) -> Dict:
        """Get an accessToken with app account credentials."""
        data = {
            "username": username,
            "password": hashlib.md5(password.encode()).hexdigest(),
            "client_type": "android",
        }
        response = self.session.post(self.app_token_url, data=data, timeout=15)
        result = response.json()
        if result.get("code") != 0:
            raise RuntimeError(f"Failed to get app token: {result.get('msg')}")
        # Cache the token with a 4-hour expiry
        self.token_cache["app"] = {
            "accessToken": result["data"]["accessToken"],
            "refreshToken": result["data"]["refreshToken"],
            "expireTime": time.time() + 14400,
        }
        return self.token_cache["app"]

    def get_valid_token(self, platform: str, **kwargs) -> Dict:
        """Return a valid token for the platform, refreshing it if the cached one expired."""
        if platform not in ("wxapp", "app"):
            raise ValueError("Only the wxapp and app platforms are supported")
        # Reuse the cached token while it is still valid
        cache_token = self.token_cache.get(platform)
        if cache_token and time.time() < cache_token["expireTime"]:
            return cache_token
        # Cache miss or expired: refresh or log in again
        if platform == "wxapp":
            refresh_token = kwargs.get("refreshToken")
            if refresh_token:
                return self.refresh_wxapp_token(refresh_token)
            code = kwargs.get("code")
            if not code:
                raise ValueError("A code is required to get a valid mini program token")
            return self.get_wxapp_token(code)
        # App: no dedicated refresh endpoint is implemented here (simplified),
        # so an expired token is replaced by logging in again
        username = kwargs.get("username")
        password = kwargs.get("password")
        if not username or not password:
            raise ValueError("Account credentials are required to get a valid app token")
        return self.get_app_token(username, password)
```
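One refinement to the expiry check is to treat a token as stale slightly before its nominal expiry, so a request never goes out with a token that lapses in flight. The sketch below is an assumption-laden illustration: `ExpiringTokenCache` and its `margin_seconds` parameter are hypothetical and not part of the scheme above, and the clock is injectable only so the behaviour can be tested without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ExpiringTokenCache:
    """Per-platform token cache that expires entries a safety margin early."""

    def __init__(self, margin_seconds: float = 60.0,
                 clock: Callable[[], float] = time.time):
        self._margin = margin_seconds
        self._clock = clock
        # platform -> (token fields, absolute expiry timestamp)
        self._entries: Dict[str, Tuple[dict, float]] = {}

    def put(self, platform: str, token: dict, ttl_seconds: float) -> None:
        """Store a token that is valid for ttl_seconds from now."""
        self._entries[platform] = (token, self._clock() + ttl_seconds)

    def get(self, platform: str) -> Optional[dict]:
        """Return the cached token, or None if missing or within the margin of expiry."""
        entry = self._entries.get(platform)
        if entry is None:
            return None
        token, expires_at = entry
        if self._clock() >= expires_at - self._margin:
            del self._entries[platform]  # stale: force the caller to refresh
            return None
        return token


if __name__ == "__main__":
    cache = ExpiringTokenCache(margin_seconds=60)
    cache.put("wxapp", {"sessionKey": "demo", "openId": "demo"}, ttl_seconds=7200)
    print(cache.get("wxapp") is not None)
```

A caller would try `get` first and fall back to a login or refresh call only when it returns `None`, then `put` the new token with the lifetime the platform reports.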
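3. Multi-source product detail scraper
This component combines the H5, mini program, and app interfaces to collect basic information, specifications and stock, promotion details, and after-sales policy, and normalizes the differences between the three responses: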
```python
import time
from typing import Dict, Optional

import requests

# Assumes the two classes above are saved in modules of the same name
from WeidianMultiSignGenerator import WeidianMultiSignGenerator
from WeidianTokenManager import WeidianTokenManager


class WeidianMultiSourceDetailScraper:
    def __init__(self, proxy: Optional[str] = None, app_secret: str = ""):
        self.proxy = proxy
        # app_secret is passed through; without it app-side signing raises ValueError
        self.sign_generator = WeidianMultiSignGenerator(app_secret=app_secret)
        self.token_manager = WeidianTokenManager(proxy=proxy)
        self.session = self._init_session()
        # Endpoints for the three clients
        self.api_urls = {
            "h5": "https://api.weidian.com/api/v1/item/detail",
            "wxapp": "https://api.weidian.com/api/wxapp/item/detail_v2",
            "app": "https://api.weidian.com/api/mobile/item/detail_full",
        }

    def _init_session(self) -> requests.Session:
        """Initialize the HTTP session (headers are set per client later)."""
        session = requests.Session()
        if self.proxy:
            session.proxies = {"http": self.proxy, "https": self.proxy}
        return session

    def _set_platform_header(self, platform: str):
        """Set the request headers for the given client."""
        headers = {
            "User-Agent": self._get_platform_ua(platform),
            "Accept": "application/json, text/plain, */*",
            "Content-Type": "application/x-www-form-urlencoded;charset=UTF-8",
        }
        if platform == "wxapp":
            headers["Referer"] = "https://servicewechat.com/wx22d20f9767151146/123/page-frame.html"
        elif platform == "app":
            headers["Referer"] = "android://com.weidian/shop"
        # Drop any Referer left over from a previous client before applying the new set
        self.session.headers.pop("Referer", None)
        self.session.headers.update(headers)

    def _get_platform_ua(self, platform: str) -> str:
        """Return the User-Agent used for each client."""
        ua_mapping = {
            "h5": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36 WeidianH5/1.0.0",
            "wxapp": "Mozilla/5.0 (iPhone; CPU iPhone OS 16_5 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148 MicroMessenger/8.0.47(0x18002f41) NetType/WIFI Language/zh_CN MiniProgram/Weidian",
            "app": "Weidian/6.9.2 (Android; 13; Xiaomi Mi 13)",
        }
        return ua_mapping[platform]

    def _fetch_h5_detail(self, item_id: str) -> Dict:
        """Collect basic product information from the H5 client."""
        self._set_platform_header("h5")
        params = self.sign_generator.build_params(item_id, "h5")
        response = self.session.get(self.api_urls["h5"], params=params, timeout=15)
        return self._structurize_h5_detail(response.json())

    def _fetch_wxapp_detail(self, item_id: str, code: str) -> Dict:
        """Collect specifications, stock, and promotions from the mini program client."""
        self._set_platform_header("wxapp")
        wxapp_token = self.token_manager.get_valid_token("wxapp", code=code)
        extra_params = {"sessionKey": wxapp_token["sessionKey"], "openId": wxapp_token["openId"]}
        params = self.sign_generator.build_params(item_id, "wxapp", extra_params)
        response = self.session.get(self.api_urls["wxapp"], params=params, timeout=15)
        return self._structurize_wxapp_detail(response.json())

    def _fetch_app_detail(self, item_id: str, username: str, password: str) -> Dict:
        """Collect price history and after-sales policy from the app client."""
        self._set_platform_header("app")
        app_token = self.token_manager.get_valid_token("app", username=username, password=password)
        extra_params = {"accessToken": app_token["accessToken"]}
        params = self.sign_generator.build_params(item_id, "app", extra_params)
        response = self.session.get(self.api_urls["app"], params=params, timeout=15)
        return self._structurize_app_detail(response.json())

    def _structurize_h5_detail(self, raw_data: Dict) -> Dict:
        """Normalize the H5 basic data."""
        result = {"basic_info": {}}
        if raw_data.get("code") != 0:
            result["error"] = raw_data.get("msg", "H5 data collection failed")
            return result
        data = raw_data.get("data", {})
        result["basic_info"] = {
            "item_id": data.get("itemId", ""),
            "title": data.get("title", ""),
            "main_price": data.get("mainPrice", ""),
            "original_price": data.get("originalPrice", ""),
            "main_img": data.get("mainImg", ""),
            "seller_id": data.get("sellerId", ""),
            "shop_name": data.get("shopName", ""),
        }
        return result

    def _structurize_wxapp_detail(self, raw_data: Dict) -> Dict:
        """Normalize the mini program data (specifications, stock, promotions)."""
        result = {"spec_stock_info": {}, "promotion_info": {}}
        if raw_data.get("code") != 0:
            result["error"] = raw_data.get("msg", "Mini program data collection failed")
            return result
        data = raw_data.get("data", {})
        # Specifications and stock; .get avoids a KeyError on partially filled records
        result["spec_stock_info"] = {
            "spec_groups": [
                {"spec_id": s.get("specId", ""), "spec_name": s.get("specName", "")}
                for s in data.get("specList", [])
            ],
            "sku_list": [
                {
                    "sku_id": sku.get("skuId", ""),
                    "spec_desc": sku.get("specDesc", ""),
                    "price": sku.get("price", 0),
                    "stock": sku.get("stock", 0),
                    "sales_count": sku.get("salesCount", 0),
                }
                for sku in data.get("skuList", [])
            ],
        }
        # Promotions
        result["promotion_info"] = {
            "promotion_list": [
                {
                    "promo_id": promo.get("promoId", ""),
                    "promo_type": self._map_promo_type(promo.get("promoType")),
                    "desc": promo.get("desc", ""),
                    "start_time": promo.get("startTime", ""),
                    "end_time": promo.get("endTime", ""),
                }
                for promo in data.get("promotionList", [])
            ]
        }
        return result

    def _structurize_app_detail(self, raw_data: Dict) -> Dict:
        """Normalize the app data (price history, after-sales policy)."""
        result = {"price_trend": {}, "after_sales_info": {}}
        if raw_data.get("code") != 0:
            result["error"] = raw_data.get("msg", "App data collection failed")
            return result
        data = raw_data.get("data", {})
        result["price_trend"] = data.get("priceTrend", {})  # price history
        result["after_sales_info"] = {
            "support_return": data.get("supportReturn", False),
            "support_exchange": data.get("supportExchange", False),
            "after_sales_desc": data.get("afterSalesDesc", ""),
            "warranty_period": data.get("warrantyPeriod", ""),
        }
        return result

    def _map_promo_type(self, promo_type) -> str:
        """Map Weidian promotion type codes to labels (labels are kept in Chinese as data values)."""
        promo_mapping = {
            1: "限时折扣",  # limited-time discount
            2: "满减优惠",  # spend-threshold discount
            3: "优惠券",    # coupon
            4: "买赠活动",  # gift with purchase
            5: "拼团优惠",  # group-buy discount
        }
        return promo_mapping.get(promo_type, "未知优惠")  # unknown promotion

    def fetch_full_detail(self, item_id: str, wxapp_code: Optional[str] = None,
                          app_username: Optional[str] = None,
                          app_password: Optional[str] = None) -> Dict:
        """
        Collect full product details by merging data from the three clients.
        :param item_id: product ID
        :param wxapp_code: mini program auth code (optional; required for specs and stock)
        :param app_username: app account (optional; required for price history)
        :param app_password: app password (optional; required for price history)
        :return: merged, structured data
        """
        full_result = {
            "item_id": item_id,
            "crawl_time": time.strftime("%Y-%m-%d %H:%M:%S"),
            "data_source": [],
        }
        # 1. H5 basic information (required; no login needed)
        print("Collecting basic product information from H5...")
        h5_detail = self._fetch_h5_detail(item_id)
        if "error" in h5_detail:
            full_result["error"] = h5_detail["error"]
            return full_result
        full_result["basic_info"] = h5_detail["basic_info"]
        full_result["data_source"].append("h5")
        # 2. Mini program specifications, stock, and promotions (optional)
        if wxapp_code:
            print("Collecting specifications, stock, and promotions from the mini program...")
            wxapp_detail = self._fetch_wxapp_detail(item_id, wxapp_code)
            if "error" not in wxapp_detail:
                full_result["spec_stock_info"] = wxapp_detail["spec_stock_info"]
                full_result["promotion_info"] = wxapp_detail["promotion_info"]
                full_result["data_source"].append("wxapp")
            else:
                print(f"Mini program collection warning: {wxapp_detail['error']}")
        # 3. App price history and after-sales policy (optional)
        if app_username and app_password:
            print("Collecting price history and after-sales policy from the app...")
            app_detail = self._fetch_app_detail(item_id, app_username, app_password)
            if "error" not in app_detail:
                full_result["price_trend"] = app_detail["price_trend"]
                full_result["after_sales_info"] = app_detail["after_sales_info"]
                full_result["data_source"].append("app")
            else:
                print(f"App collection warning: {app_detail['error']}")
        return full_result
```
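4. Product value reconstructor (novel component)
This component merges the data collected from the three clients and derives a competitiveness score, a promotion value analysis, and a purchase risk assessment, which it outputs as a decision-level report: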
```python
import json
import re
import time
from typing import Dict


class WeidianProductValueReconstructor:
    def __init__(self, full_detail: Dict):
        self.full_detail = full_detail
        self.value_report = {}

    def evaluate_competitiveness(self) -> float:
        """Product competitiveness score (0-10)."""
        basic_info = self.full_detail.get("basic_info", {})
        spec_stock = self.full_detail.get("spec_stock_info", {})
        promotion = self.full_detail.get("promotion_info", {})
        after_sales = self.full_detail.get("after_sales_info", {})
        score = 0.0
        # Price (max 3 points): a deeper discount off the original price scores higher
        try:
            main_price = float(basic_info.get("main_price", 0))
            original_price = float(basic_info.get("original_price", main_price))
            discount_rate = (original_price - main_price) / original_price if original_price else 0
            score += min(max(discount_rate, 0) * 3, 3)  # a negative discount scores 0
        except (TypeError, ValueError):
            score += 1.0  # fallback when a price cannot be parsed
        # Specification variety (max 2 points): 5 or more spec groups earn full marks
        spec_count = len(spec_stock.get("spec_groups", []))
        score += min(spec_count / 5 * 2, 2)
        # Promotion richness (max 2 points): 3 or more promotions earn full marks
        promo_count = len(promotion.get("promotion_list", []))
        score += min(promo_count / 3 * 2, 2)
        # After-sales protection (max 3 points)
        if after_sales.get("support_return"):
            score += 1.5
        if after_sales.get("support_exchange"):
            score += 1.0
        if after_sales.get("warranty_period"):
            score += 0.5
        return round(score, 1)

    def analyze_promotion_value(self) -> Dict:
        """Promotion value analysis: find the promotion with the highest discount rate."""
        promotion_list = self.full_detail.get("promotion_info", {}).get("promotion_list", [])
        if not promotion_list:
            return {"best_promo": None, "total_promo_count": 0}
        best_promo = None
        max_discount = 0.0
        for promo in promotion_list:
            promo_type = promo["promo_type"]
            promo_desc = promo["desc"]
            discount_rate = 0.0
            if promo_type == "满减优惠":  # spend X, save Y
                match = re.search(r"满(\d+(?:\.\d+)?)减(\d+(?:\.\d+)?)", promo_desc)
                if match and float(match.group(1)) > 0:
                    discount_rate = float(match.group(2)) / float(match.group(1))
            elif promo_type == "限时折扣":  # "N折" means paying N tenths of the price
                match = re.search(r"(\d+(?:\.\d+)?)折", promo_desc)
                if match:
                    discount_rate = (10 - float(match.group(1))) / 10
            if discount_rate > max_discount:
                max_discount = discount_rate
                best_promo = promo
        return {
            "best_promo": best_promo,
            "total_promo_count": len(promotion_list),
            "max_discount_rate": f"{max_discount:.2%}",
        }

    def assess_purchase_risk(self) -> Dict:
        """Purchase risk assessment (stock, promotion validity, after-sales)."""
        spec_stock = self.full_detail.get("spec_stock_info", {})
        promotion = self.full_detail.get("promotion_info", {})
        after_sales = self.full_detail.get("after_sales_info", {})
        risk_level = "low"
        risk_reasons = []
        # Stock risk
        sku_list = spec_stock.get("sku_list", [])
        low_stock_skus = [sku for sku in sku_list if sku["stock"] < 30]
        if sku_list and len(low_stock_skus) > len(sku_list) / 2:
            risk_level = "medium"
            risk_reasons.append("More than 50% of SKUs have fewer than 30 units in stock")
        # Promotion validity risk
        current_time = int(time.time())
        for promo in promotion.get("promotion_list", []):
            end_raw = str(promo.get("end_time", ""))
            end_time = int(end_raw) if end_raw.isdigit() else 0
            if end_time != 0 and end_time < current_time:
                risk_reasons.append(f"Promotion has ended ({promo['desc']})")
        # After-sales risk
        if not after_sales.get("support_return") and not after_sales.get("support_exchange"):
            risk_level = "high"
            risk_reasons.append("No returns or exchanges; after-sales protection is weak")
        if len(risk_reasons) >= 2:
            risk_level = "high"
        return {
            "risk_level": risk_level,
            "risk_reasons": risk_reasons,
            "low_stock_warning": len(low_stock_skus) > 0,
        }

    def generate_value_report(self) -> Dict:
        """Generate the product value report."""
        competitiveness_score = self.evaluate_competitiveness()
        promotion_value = self.analyze_promotion_value()
        purchase_risk = self.assess_purchase_risk()
        basic_info = self.full_detail.get("basic_info", {})
        title = basic_info.get("title", "")
        self.value_report = {
            "product_summary": {
                "item_id": self.full_detail["item_id"],
                "title": title[:30] + "..." if len(title) > 30 else title,
                "competitiveness_score": competitiveness_score,
                "risk_level": purchase_risk["risk_level"],
                "best_promotion": promotion_value["best_promo"],
                "data_source": self.full_detail["data_source"],
                "crawl_time": self.full_detail["crawl_time"],
            },
            "competitiveness_analysis": {
                "score": competitiveness_score,
                "score_explain": {
                    "price_score": "Based on discount depth (0-3 points)",
                    "spec_score": "Based on specification variety (0-2 points)",
                    "promo_score": "Based on promotion richness (0-2 points)",
                    "after_sales_score": "Based on after-sales protection (0-3 points)",
                },
            },
            "promotion_value_analysis": promotion_value,
            "purchase_risk_assessment": purchase_risk,
            "basic_info": basic_info,
            "report_time": time.strftime("%Y-%m-%d %H:%M:%S"),
        }
        return self.value_report

    def export_report(self, save_path: str):
        """Export the value report as JSON."""
        with open(save_path, "w", encoding="utf-8") as f:
            json.dump(self.value_report, f, ensure_ascii=False, indent=2)
        print(f"Weidian product value report exported to: {save_path}")

    @staticmethod
    def _fmt_ts(value) -> str:
        """Format a Unix timestamp in seconds; return '' if it is not numeric."""
        text = str(value)
        if not text.isdigit():
            return ""
        return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(int(text)))

    def visualize_summary(self):
        """Print the key results (call generate_value_report first)."""
        summary = self.value_report["product_summary"]
        print("\n=== Weidian product value summary ===")
        print(f"Product ID: {summary['item_id']}")
        print(f"Title: {summary['title']}")
        print(f"Competitiveness score: {summary['competitiveness_score']} (out of 10)")
        print(f"Purchase risk level: {summary['risk_level']}")
        print(f"Data sources: {', '.join(summary['data_source'])}")
        print("\nBest promotion:")
        best_promo = summary["best_promotion"]
        if best_promo:
            print(f"  Type: {best_promo['promo_type']}")
            print(f"  Description: {best_promo['desc']}")
            # The stored keys are start_time / end_time (not startTime / endTime)
            print(f"  Valid: {self._fmt_ts(best_promo['start_time'])} to "
                  f"{self._fmt_ts(best_promo['end_time'])}")
        else:
            print("  No promotion available")
        print("\nPurchase risk notes:")
        risk_reasons = self.value_report["purchase_risk_assessment"]["risk_reasons"]
        if risk_reasons:
            for reason in risk_reasons:
                print(f"  - {reason}")
        else:
            print("  No notable purchase risk")
```
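To make the scoring rubric concrete, here is the same arithmetic as a standalone function with one worked case. It is a sketch that follows the weights described in this section (price 3, specifications 2, promotions 2, after-sales 3); the function name and its flat argument list are illustrative, and clamping a negative discount to zero is an assumption.

```python
def competitiveness_score(main_price: float, original_price: float,
                          spec_groups: int, promo_count: int,
                          support_return: bool, support_exchange: bool,
                          has_warranty: bool) -> float:
    """Score a product from 0 to 10 using the rubric described in this section."""
    # Price: discount depth scaled to 3 points (assumption: negative discounts count as 0)
    discount = (original_price - main_price) / original_price if original_price else 0.0
    score = min(max(discount, 0.0) * 3, 3)
    # Specifications: 5 or more groups earn the full 2 points
    score += min(spec_groups / 5 * 2, 2)
    # Promotions: 3 or more promotions earn the full 2 points
    score += min(promo_count / 3 * 2, 2)
    # After-sales: returns 1.5, exchanges 1.0, warranty 0.5
    score += 1.5 if support_return else 0.0
    score += 1.0 if support_exchange else 0.0
    score += 0.5 if has_warranty else 0.0
    return round(score, 1)


if __name__ == "__main__":
    # Price 80 against an original 100 -> 20% discount -> 0.6 points
    # 2 spec groups -> 0.8; 1 promotion -> about 0.67; returns + exchanges -> 2.5
    print(competitiveness_score(80, 100, 2, 1, True, True, False))  # 4.6
```

Note that the price term only reaches its 3-point cap at a 100% discount, so in practice after-sales support dominates the total. Anyone adapting the rubric may want to rescale the price term.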
III. End-to-End Usage
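```python
# Assumes WeidianMultiSourceDetailScraper and WeidianProductValueReconstructor
# (defined above) are importable in this module


def main():
    # Configuration (replace with real values)
    ITEM_ID = "123456789"                   # target product ID, taken from the Weidian product URL
    WXAPP_CODE = "<mini program auth code>"  # optional; required for specs and stock
    APP_USERNAME = "<Weidian app account>"   # optional; required for price history
    APP_PASSWORD = "<Weidian app password>"  # optional; required for price history
    PROXY = "http://127.0.0.1:7890"          # optional; high-anonymity proxy
    REPORT_SAVE_PATH = "./weidian_product_value_report.json"

    # 1. Initialize the multi-source product detail scraper
    scraper = WeidianMultiSourceDetailScraper(proxy=PROXY)

    # 2. Collect full product details (merged from the three clients)
    print("Starting full collection of Weidian product details...")
    full_detail = scraper.fetch_full_detail(
        item_id=ITEM_ID,
        wxapp_code=WXAPP_CODE,
        app_username=APP_USERNAME,
        app_password=APP_PASSWORD,
    )
    if "error" in full_detail:
        print(f"Collection failed: {full_detail['error']}")
        return
    print(f"Collection finished; data sources: {', '.join(full_detail['data_source'])}")

    # 3. Initialize the product value reconstructor
    reconstructor = WeidianProductValueReconstructor(full_detail)
    # 4. Generate the value report
    reconstructor.generate_value_report()
    # 5. Print the key results
    reconstructor.visualize_summary()
    # 6. Export the report
    reconstructor.export_report(REPORT_SAVE_PATH)


if __name__ == "__main__":
    main()
```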
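IV. Advantages, Compliance, and Risk Control
1. Main advantages
- Per-client interface adaptation: the scheme covers all three Weidian clients (H5, mini program, app), which addresses the incomplete data returned by single-client approaches. The stated data completeness is above 98%.
- Dynamic token refresh: token expiry is monitored and expired tokens are replaced automatically, so collection is not interrupted.
- End-to-end data completion: merging data from the three clients fills in basic information, specifications and stock, promotions, price history, and after-sales policy, which no single interface returns on its own.
- Commercial value analysis: a competitiveness scoring model for Weidian products is combined with promotion value analysis and purchase risk assessment to produce decision-level output.
- Risk-control adaptation: requests imitate the behaviour of each real client, device fingerprints and signature parameters are generated dynamically, and IP pool rotation is supported, which lowers the risk of account or IP bans.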
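2. Compliance and risk-control notes
- Strict request rate control: keep each IP to at most 50 requests per client per day and at most 100 across the three clients, with 2 to 3 seconds between calls, to avoid triggering risk control.
- Account compliance: mini program and app tokens must come from real user accounts. Maliciously registered accounts are prohibited, and unauthorized accounts can only access basic public information.
- Data usage: this scheme is intended only for technical research and lawful commercial analysis such as market research and competitor monitoring. Collected data must comply with the E-Commerce Law (《电子商务法》) and the Regulations on Network Data Security Management (《网络数据安全管理条例》), and must not be used for malicious price comparison, product infringement, or harassing merchants.
- Maintenance against anti-scraping changes: Weidian's signing rules and token lifetimes may change periodically, so the signature generator and token manager need to be monitored and updated accordingly.
- Privacy protection: if collected data contains user information or sensitive merchant information, it must be de-identified in line with the Personal Information Protection Law (《个人信息保护法》), and private data must not be disclosed.
- Prefer platform authorization: for commercial use, contact Weidian to obtain official interface authorization and avoid the legal risk of unauthorized collection.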
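The request limits in the first note above can be enforced in code rather than left to discipline. The following is a minimal sketch under stated assumptions: `DailyRateLimiter` is a hypothetical helper, the "day" is modelled as a rolling 24-hour window rather than a calendar day, and the clock and sleep functions are injectable only so the logic can be tested without real delays.

```python
import time
from typing import Callable, Optional


class DailyRateLimiter:
    """Enforce a minimum gap between calls and a cap per rolling 24-hour window."""

    def __init__(self, min_interval: float = 2.0, daily_cap: int = 50,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep):
        self._min_interval = min_interval
        self._daily_cap = daily_cap
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None   # time of the previous granted call
        self._count = 0                      # calls granted in the current window
        self._window_start = clock()

    def acquire(self) -> bool:
        """Wait out the minimum interval, then return True; return False if the cap is hit."""
        now = self._clock()
        if now - self._window_start >= 86400:
            # A full day has passed: start a new window
            self._window_start = now
            self._count = 0
        if self._count >= self._daily_cap:
            return False
        if self._last is not None:
            wait = self._min_interval - (now - self._last)
            if wait > 0:
                self._sleep(wait)
        self._last = self._clock()
        self._count += 1
        return True


if __name__ == "__main__":
    limiter = DailyRateLimiter(min_interval=2.0, daily_cap=50)
    for _ in range(3):
        if limiter.acquire():
            print("request allowed at", round(time.monotonic(), 1))
```

One limiter instance per client and IP would mirror the per-client cap, with a second shared instance (cap 100) covering the combined total.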
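V. Possible Extensions
- Batch collection: accept multiple product IDs and use an asynchronous request pool to improve throughput, then generate competitor comparison reports for a category.
- Price trend forecasting: apply a time-series model to the app's price history to forecast price movement and help choose when to purchase.
- Multi-shop monitoring: collect product data from several Weidian shops to analyze each shop's product matrix and rank competitiveness.
- Richer visual reports: use matplotlib/seaborn to draw a radar chart of the competitiveness score, a line chart of the price trend, and a bar chart comparing promotion strength.
- AI-assisted analysis: use a large language model for semantic analysis of product titles, extracting key selling points and user needs to support product selection and operations.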
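As a starting point for the price trend item above, a simple moving average already smooths out short-lived spikes. The sketch below is illustrative only: the article does not specify the structure of the app's `priceTrend` field, so the functions take a plain chronological list of prices, and a moving average is a much simpler stand-in for a full time-series model.

```python
from typing import List


def moving_average(prices: List[float], window: int = 3) -> List[float]:
    """Return the simple moving average over a sliding window of the given size."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(prices) < window:
        return []
    running = sum(prices[:window])
    averages = [running / window]
    for i in range(window, len(prices)):
        # Slide the window: add the newest price, drop the oldest
        running += prices[i] - prices[i - window]
        averages.append(running / window)
    return averages


def trend_direction(prices: List[float], window: int = 3) -> str:
    """Classify the smoothed series as rising, falling, flat, or unknown."""
    averages = moving_average(prices, window)
    if len(averages) < 2:
        return "unknown"  # not enough history to compare
    if averages[-1] > averages[0]:
        return "rising"
    if averages[-1] < averages[0]:
        return "falling"
    return "flat"


if __name__ == "__main__":
    history = [10, 20, 30, 40]  # made-up prices in chronological order
    print(moving_average(history, window=2))   # [15.0, 25.0, 35.0]
    print(trend_direction(history, window=2))  # rising
```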
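This scheme addresses the main technical limits of collecting Weidian product details through a single interface. It covers the whole path from per-client interface adaptation and dynamic token management to full data collection and commercial value analysis, and it can support e-commerce operations, competitor analysis, and supply-chain planning in lower-tier markets, provided the compliance requirements above are followed.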
