Conclusion:
videoid = the value of the tt-videoid attribute in the article page source
video_json = f'https://i.snssdk.com/video/urls/1/toutiao/mp4/{videoid}'
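Analysis steps:
The goal is to download the video embedded in this article link.
Packet capture analysis
The rendered page shows a video tag, but the raw page source does not contain that tag. What the source does contain is a very distinctive marker: tt-videoid.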
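Pulling the ID out of the raw page source can be sketched as below. This is only a minimal sketch: it assumes the marker appears as a quoted HTML attribute of the form tt-videoid="...", and both the `extract_videoid` helper and the sample HTML are made up for illustration.

```python
import re
from typing import Optional

# Assumed form: tt-videoid="<id>" (single or double quotes) somewhere in the raw HTML.
_VIDEOID_RE = re.compile(r'tt-videoid\s*=\s*["\']([^"\']+)["\']')


def extract_videoid(html: str) -> Optional[str]:
    """Return the first tt-videoid value found in the page source, or None."""
    match = _VIDEOID_RE.search(html)
    return match.group(1) if match else None


# Hypothetical snippet of page source, for illustration only.
sample_html = '<div class="player" tt-videoid="v0301fg10000csljvd7og65kf0m8ntq0"></div>'
print(extract_videoid(sample_html))
```

If the real page stores the ID differently (for example inside an inline script), the regular expression would need to be adjusted.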
Skipping straight to the result: src="https://i.snssdk.com/video/urls/1/toutiao/mp4/v0301fg10000csljvd7og65kf0m8ntq0?callback=tt__video__vdefy8"
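The callback parameter in that src suggests the endpoint may wrap its JSON in a function call (JSONP) when the parameter is present. That is an assumption, not something confirmed here. Under that assumption, a hypothetical `parse_jsonp` helper that accepts both plain JSON and a callback(...) wrapper could look like this:

```python
import json
import re
from typing import Any


def parse_jsonp(text: str) -> Any:
    """Parse plain JSON, or JSONP of the assumed form callback({...}) with an optional ';'."""
    text = text.strip()
    # Match "name( ... )" and keep only what is between the outer parentheses.
    match = re.fullmatch(r'[\w$.]+\s*\((.*)\)\s*;?', text, flags=re.DOTALL)
    payload = match.group(1) if match else text
    return json.loads(payload)


# Made-up response body; the real payload layout is not shown in this post.
sample = 'tt__video__vdefy8({"video_3": {"main_url": "BASE64_PLACEHOLDER"}});'
print(parse_jsonp(sample)["video_3"]["main_url"])
```

Dropping the callback parameter from the URL, as the parsing code further down does, avoids the wrapper altogether.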
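Requesting the src directly returns JSON data.
Don't panic when you see something like this: the main_url value is just Base64-encoded, so decode it.
And with that, the real video URL comes out.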
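The decoding step on its own can be sketched as follows. The `decode_main_url` helper and the example URL are illustrative only. Restoring missing '=' padding is a defensive assumption, since some APIs return unpadded Base64.

```python
import base64


def decode_main_url(main_url: str) -> str:
    """Base64-decode a main_url value into a plain URL string."""
    # Add any missing '=' padding so b64decode does not raise on unpadded input.
    padded = main_url + "=" * (-len(main_url) % 4)
    return base64.b64decode(padded).decode("utf-8")


# Round trip with a made-up URL (not a real video address).
example_url = "https://example.com/video.mp4"
encoded = base64.b64encode(example_url.encode("utf-8")).decode("ascii")
print(decode_main_url(encoded))
```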
Parsing code
python
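import base64
import json

import requests
from jsonpath import jsonpath  # third-party "jsonpath" package, assumed from the jsonpath(obj, expr) call

# Value of the tt-videoid attribute taken from the article page source
videoid = "v0301fg10000csljvd7og65kf0m8ntq0"
video_json = f'https://i.snssdk.com/video/urls/1/toutiao/mp4/{videoid}'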
headers = {
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
    "accept-language": "zh-CN,zh;q=0.9",
    "cache-control": "no-cache",
    "pragma": "no-cache",
    "priority": "u=0, i",
    "sec-ch-ua": "\"Google Chrome\";v=\"129\", \"Not=A?Brand\";v=\"8\", \"Chromium\";v=\"129\"",
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "\"Windows\"",
    "sec-fetch-dest": "document",
    "sec-fetch-mode": "navigate",
    "sec-fetch-site": "none",
    "sec-fetch-user": "?1",
    "upgrade-insecure-requests": "1",
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36"
}
cookies = {
    "odin_tt": "a2d3ceea612f2780e30a937100bd7f046490e1444ea859921535a09461162ba4175263d58898dd877d347b78eb144dbd6cd7bb127969860c77836bbeca4e6fe5"
}
response = requests.get(video_json, headers=headers, cookies=cookies)
json_data = json.loads(response.text)
# jsonpath returns a list of matches, or False when nothing matches
matches = jsonpath(json_data, '$..video_3.main_url')
main_url = matches[0] if matches else None
if main_url:
    # main_url is Base64-encoded; decoding it yields the real video address
    decoded_bytes = base64.b64decode(main_url)
    video_url = decoded_bytes.decode('utf-8')
Wrap-up:
The article video can now be extracted and downloaded normally.
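For the download step itself, a minimal sketch follows. It uses the standard library's urllib.request rather than the requests library used in the parsing code, purely to keep the example dependency-free. The `download_video` helper, its default User-Agent, and the output file name are assumptions for illustration.

```python
import shutil
import urllib.request


def download_video(video_url: str, out_path: str, user_agent: str = "Mozilla/5.0") -> str:
    """Stream the resource at video_url into out_path and return out_path."""
    request = urllib.request.Request(video_url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response, open(out_path, "wb") as out_file:
        # copyfileobj copies in chunks, so the whole video is never held in memory.
        shutil.copyfileobj(response, out_file)
    return out_path


# Usage, with video_url being the decoded address from the parsing code:
# download_video(video_url, "video.mp4")
```

Whether the video server also expects the cookies or a Referer header is not verified here. If the plain request is rejected, passing the same headers used for the JSON request would be the first thing to try.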