JD.com Product Review API Series: JSON Response Data

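This post analyzes the API endpoints used to collect JD.com product reviews and shows sample JSON responses for each. It covers the endpoints that the web front end actually uses and explains the structure of the data they return.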

I. Core JD Product Review API Endpoints

1. Product review list endpoint (web version)

Endpoint URL:
https://club.jd.com/comment/productPageComments.action

Request method: GET
Request parameters:

python
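{
    "callback": "fetchJSON_comment98",  # JSONP callback function name
    "productId": "100012014970",       # Product ID (required)
    "score": 0,                        # Rating filter: 0 = all, 1-5 = star rating
    "sortType": 5,                     # Sort order: 5 = recommended, 6 = by time
    "page": 0,                         # Page number (starts at 0)
    "pageSize": 10,                    # Reviews per page (maximum 10)
    "isShadowSku": 0,                  # Whether this is a shadow SKU
    "fold": 1                          # Whether to fold identical reviews
}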
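
To make the parameter list concrete, here is a minimal sketch that assembles the full GET URL using only the standard library. The helper name `build_comment_url` is an assumption made for illustration; the endpoint, parameter names, and values are the ones listed above.

```python
from urllib.parse import urlencode

BASE_URL = "https://club.jd.com/comment/productPageComments.action"


def build_comment_url(product_id, page=0, score=0, sort_type=5, fold=1):
    """Build the GET URL for one page of reviews (hypothetical helper)."""
    params = {
        "callback": "fetchJSON_comment98",  # JSONP callback name
        "productId": product_id,
        "score": score,
        "sortType": sort_type,
        "page": page,                       # zero-based page index
        "pageSize": 10,
        "isShadowSku": 0,
        "fold": fold,
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_comment_url("100012014970", page=2)
print(url)
```

Building the URL separately from sending it makes it easy to inspect exactly what will be requested before any traffic reaches the site.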

Response JSON structure (after removing the JSONP wrapper):

json
{
    "productCommentSummary": {
        "goodRateShow": 98,           // Positive-review rate, as a percentage
        "commentCount": 50000,        // Total number of reviews
        "defaultGoodCount": 48000     // Number of default positive reviews
    },
    "hotCommentTagStatistics": [      // Popular tags
        {"id": 1, "name": "物流快", "count": 5000},   // "fast shipping"
        {"id": 2, "name": "质量好", "count": 3000}    // "good quality"
    ],
    "comments": [                    // Review list
        {
            "id": 123456789,          // Review ID
            "content": "东西很好，物流很快",  // Review text ("Great item, fast shipping")
            "creationTime": "2023-01-01 12:00:00",  // Review time
            "score": 5,               // Rating (1-5)
            "usefulVoteCount": 200,   // Number of "useful" votes
            "nickname": "jd_123",     // User nickname
            "productColor": "白色",   // Product color ("white")
            "productSize": "XL",      // Product size
            "images": [               // Review images (if any)
                {"id": 111, "imgUrl": "//img30.360buyimg.com/n1/s450x450_jfs/..."}
            ]
        }
    ]
}
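
The structure above can be explored offline before writing any network code. The following sketch unwraps a JSONP string and reads a few fields from it. The `raw` string is a shortened, illustrative sample in the shape shown above (not a captured response), and `unwrap_jsonp` is a hypothetical helper name.

```python
import json
import re

# Shortened sample in the shape shown above; the values are illustrative.
raw = (
    'fetchJSON_comment98({"productCommentSummary": {"goodRateShow": 98, '
    '"commentCount": 50000}, "hotCommentTagStatistics": [{"id": 2, "name": '
    '"质量好", "count": 3000}, {"id": 1, "name": "物流快", "count": 5000}], '
    '"comments": [{"id": 123456789, "score": 5, "usefulVoteCount": 200, '
    '"nickname": "jd_123"}]});'
)


def unwrap_jsonp(text):
    """Return the JSON text inside callback(...); or the input if it is not wrapped."""
    match = re.match(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", text, re.S)
    return match.group(1) if match else text


data = json.loads(unwrap_jsonp(raw))
summary = data["productCommentSummary"]
# Rank the popular tags by how often they were used.
tags = sorted(data["hotCommentTagStatistics"], key=lambda t: t["count"], reverse=True)
first = data["comments"][0]

print(summary["commentCount"], [t["name"] for t in tags], first["nickname"])
```

Matching the wrapper with a regular expression, rather than slicing a fixed number of characters, keeps the parsing step working if the callback name changes or the trailing semicolon is absent.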

2. Product review statistics endpoint

Endpoint URL:
https://club.jd.com/comment/productCommentSummaries.action

Request parameters:

python
{
    "referenceIds": "100012014970",   # Product ID (separate multiple IDs with commas)
    "callback": "fetchJSON_comment98vv333"  # JSONP callback function name
}

Response JSON structure:

json
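{
    "CommentsCount": [
        {
            "SkuId": 100012014970,
            "ProductId": 100012014970,
            "CommentCount": 50000,    // Total number of reviews
            "GoodCount": 48000,       // Number of positive reviews
            "GeneralCount": 1500,     // Number of neutral reviews
            "PoorCount": 500,         // Number of negative reviews
            "VideoCount": 200,        // Number of video reviews
            "AfterCount": 300,        // Number of follow-up reviews
            "GoodRate": 0.98,         // Positive-review rate
            "GeneralRate": 0.03,      // Neutral-review rate
            "PoorRate": 0.01          // Negative-review rate
        }
    ]
}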
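
Note that in the sample above the three reported rates sum to 1.02, so the `*Rate` fields should not be assumed to equal the ratios of the counts. If you need shares that add up to exactly 1, one option is to recompute them from the raw counts. A minimal sketch, with `rating_shares` as a hypothetical helper name:

```python
def rating_shares(stats):
    """Compute good/general/poor shares from the counts in one CommentsCount entry."""
    total = stats["GoodCount"] + stats["GeneralCount"] + stats["PoorCount"]
    if total == 0:
        # No rated reviews yet: avoid dividing by zero.
        return {"good": 0.0, "general": 0.0, "poor": 0.0}
    return {
        "good": stats["GoodCount"] / total,
        "general": stats["GeneralCount"] / total,
        "poor": stats["PoorCount"] / total,
    }


# Counts taken from the sample response above.
sample = {"GoodCount": 48000, "GeneralCount": 1500, "PoorCount": 500, "GoodRate": 0.98}
shares = rating_shares(sample)
print({name: round(value, 2) for name, value in shares.items()})
```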

II. Python Implementation (Fetching the JSON Directly)

python
import json
import random
import re
import time

import requests


def strip_jsonp(text):
    """Return the JSON text inside a JSONP wrapper such as callback({...});"""
    match = re.match(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', text, re.S)
    return match.group(1) if match else text


def get_jd_comments_json(product_id, pages=3):
    base_url = "https://club.jd.com/comment/productPageComments.action"
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
        'Referer': f'https://item.jd.com/{product_id}.html'
    }

    all_comments = []
    for page in range(pages):
        params = {
            'callback': 'fetchJSON_comment98',
            'productId': product_id,
            'score': 0,
            'sortType': 5,
            'page': page,
            'pageSize': 10,
            'isShadowSku': 0,
            'fold': 1
        }

        try:
            # A timeout keeps a stalled connection from hanging the loop
            response = requests.get(base_url, headers=headers, params=params, timeout=10)
            if response.status_code == 200:
                # Unwrap the JSONP response before parsing it as JSON
                data = json.loads(strip_jsonp(response.text))
                comments = data.get('comments', [])
                if not comments:
                    break  # An empty page means there is nothing more to fetch
                all_comments.extend(comments)
                print(f"Fetched page {page+1}: {len(comments)} reviews")
            else:
                print(f"Request failed, status code: {response.status_code}")

            time.sleep(random.uniform(1, 3))  # Random delay to reduce the risk of being blocked
        except (requests.RequestException, ValueError) as e:
            # ValueError also covers json.JSONDecodeError
            print(f"Failed to fetch page {page+1}: {e}")

    return all_comments
 
# Usage example
if __name__ == "__main__":
    product_id = "100012014970"  # Sample product ID
    comments = get_jd_comments_json(product_id, pages=2)

    # Save as a JSON file
    with open('jd_comments.json', 'w', encoding='utf-8') as f:
        json.dump(comments, f, ensure_ascii=False, indent=2)
    print(f"Collected {len(comments)} reviews, saved to jd_comments.json")

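III. Key Fields

Field	Description
`content`	Body text of the review
`creationTime`	Review time (format: YYYY-MM-DD HH:mm:ss)
`score`	Rating (1-5 stars)
`productColor`	Color of the purchased product
`productSize`	Size of the purchased product
`images`	List of review images uploaded by the user
`usefulVoteCount`	Number of votes from other users who found the review useful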
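
As an illustration of how these fields might be used, the sketch below flattens one raw review into a row that is easier to analyze. It parses `creationTime` into a `datetime` and adds a scheme to protocol-relative image URLs such as the `//img30...` value in the sample response. The helper name `flatten_comment`, the output key names, and the placeholder image host are assumptions for illustration.

```python
from datetime import datetime


def flatten_comment(comment):
    """Turn one raw review dict into a flat row (hypothetical helper)."""
    image_urls = []
    for image in comment.get("images") or []:
        url = image.get("imgUrl", "")
        # Protocol-relative URLs ("//host/path") need a scheme before downloading.
        image_urls.append("https:" + url if url.startswith("//") else url)
    return {
        "content": comment.get("content", ""),
        "created_at": datetime.strptime(comment["creationTime"], "%Y-%m-%d %H:%M:%S"),
        "score": comment.get("score"),
        "color": comment.get("productColor"),
        "size": comment.get("productSize"),
        "useful_votes": comment.get("usefulVoteCount", 0),
        "image_urls": image_urls,
    }


sample = {
    "content": "东西很好，物流很快",
    "creationTime": "2023-01-01 12:00:00",
    "score": 5,
    "usefulVoteCount": 200,
    "productColor": "白色",
    "productSize": "XL",
    # Placeholder host and path, not a real JD image URL.
    "images": [{"id": 111, "imgUrl": "//img.example.com/sample.jpg"}],
}
row = flatten_comment(sample)
print(row["created_at"].isoformat(), row["image_urls"])
```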

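IV. Notes

Request limits:
- Frequent requests from a single IP address trigger a CAPTCHA
- Leave 1-3 seconds between page requests
- Only the first 100 pages of reviews can be retrieved
Data completeness:
- By default the returned reviews are folded (reviews with identical content may be merged)
- To get the complete data, set fold=0
Advanced tips:
- Combine with the productCommentSummaries.action endpoint to get the overall rating distribution
- Use hotCommentTagStatistics to analyze what users care about
Legality:
- Use for personal study and research only
- Commercial use requires applying for API access through the JD Open Platform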
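
The request limits above can be turned into a small planning step that runs before any request is sent. This is a minimal sketch under the limits stated in this post (10 reviews per page, at most 100 pages, a 1-3 second pause); the function names `plan_pages` and `polite_delay` are assumptions for illustration.

```python
import math
import random

MAX_PAGES = 100  # page cap stated in the notes above
PAGE_SIZE = 10   # maximum page size from the parameter list


def plan_pages(comment_count, page_size=PAGE_SIZE, max_pages=MAX_PAGES):
    """Return the zero-based page numbers worth requesting."""
    needed = math.ceil(comment_count / page_size) if comment_count > 0 else 0
    return list(range(min(needed, max_pages)))


def polite_delay(low=1.0, high=3.0):
    """Pick a random pause, in seconds, within the suggested range."""
    return random.uniform(low, high)


pages = plan_pages(50000)
print(len(pages), pages[:3], round(polite_delay(), 2))
```

With the sample total of 50000 reviews, only 100 pages (about 1000 reviews) are reachable, which is worth knowing before sizing an analysis around the full count.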

V. Complete Data Collection Workflow

python
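import json
import re

import pandas as pd
import requests


def scrape_jd_product_data(product_id):
    # 1. Fetch the review statistics
    stats_url = "https://club.jd.com/comment/productCommentSummaries.action"
    stats_params = {
        'referenceIds': product_id,
        'callback': 'fetchJSON_comment98vv333'
    }
    stats_response = requests.get(stats_url, params=stats_params, timeout=10)
    stats_response.raise_for_status()
    # Strip the JSONP wrapper, i.e. callback( ... );
    match = re.match(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', stats_response.text, re.S)
    stats_json = json.loads(match.group(1) if match else stats_response.text)

    # 2. Fetch the review details (get_jd_comments_json is defined in Section II)
    comments = get_jd_comments_json(product_id, pages=5)

    # 3. Merge the data
    result = {
        'product_id': product_id,
        'statistics': stats_json['CommentsCount'][0],
        'comments': comments,
        'total_comments': len(comments)
    }

    return result


# Save as structured data (writing .xlsx files requires the openpyxl package)
if __name__ == "__main__":
    data = scrape_jd_product_data("100012014970")
    df = pd.json_normalize(data['comments'])
    df.to_excel("jd_comments_analysis.xlsx", index=False)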

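With the methods above, you can collect JD product review data systematically and analyze it in depth. If you need a more stable data source, consider contacting JD's business cooperation team to obtain an official data API.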