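ChatGPT has greatly improved programmer productivity; common uses include automatic code generation and static code checking. Can ChatGPT also help reverse-engineer the data encryption used by some websites?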
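Problem
One day our production service started alerting: downloads were failing because we could no longer obtain the CDN address from a certain video site.
Debugging showed that the data, which is embedded in the HTML and used to be plaintext, is now encrypted.
Since this falls within my responsibilities, I had to work out how to decrypt it.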
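Solution
Searching locates the relevant JS code; clicking "Open in Sources panel" lets you debug it with breakpoints.
Combining breakpoints with variable output, we find that the core decryption logic is `Object(le.videoDataDecrypt)`; clicking it lets you keep tracing.
At this point I was a bit dumbfounded: what is all this?
No rush, let's translate it.
So the decryption logic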
javascript
return e = a[t(540)][t(737)][t(577)](a[t(540)][t(506)][t(582)](e)), (0, o[t(782)])(e, t(747))
is equivalent to
javascript
e = a['enc']['Utf8']['stringify'](a['enc']['Base64']['parse'](e));
return (0, o['xorCipher'])(e, 'guanghui456')
We notice that `a['enc']['Base64']['parse']` looks like base64 decoding, and a quick test confirms that it is;
as for `a['enc']['Utf8']['stringify']`, its source is easy to obtain:
javascript
stringify: function(e) {
    var t = l;
    try {
        return decodeURIComponent(escape(v[t(577)](e)))
    } catch (r) {
        throw new Error(t(754))
    }
}
So the decryption logic becomes
javascript
e = decodeURIComponent(escape(base64decode(e)))
return (0, o['xorCipher'])(e, 'guanghui456')
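The `decodeURIComponent(escape(...))` pair is a common JavaScript idiom for decoding a "binary string" (one character per byte) as UTF-8. The sketch below illustrates the same idea in Python under that assumption; the sample text and variable names are made up for illustration and do not come from the site.

```python
from urllib.parse import quote, unquote

# Illustrative sample text (not taken from the site).
original = "视频 cdn"
utf8_bytes = original.encode("utf-8")

# A JS "binary string" holds one char per byte, which corresponds to latin-1.
binary_string = utf8_bytes.decode("latin-1")

# Route 1: reinterpret the binary string as bytes and decode as UTF-8.
via_codec = binary_string.encode("latin-1").decode("utf-8")

# Route 2: percent-encode the raw bytes, then percent-decode as UTF-8,
# mirroring escape() followed by decodeURIComponent().
via_percent = unquote(quote(utf8_bytes), encoding="utf-8")

print(via_codec == original, via_percent == original)
```

Both routes recover the original text, which suggests the percent-encoding round trip can be replaced by a plain UTF-8 decode when porting.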
The logic of `(0, o['xorCipher'])` can be understood, and an equivalent generated, directly with ChatGPT.
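A quick way to check both the algorithm and the key is to decrypt only the first few characters of the payload and see whether readable text comes out. The sketch below assumes `xorCipher` is a plain repeating-key XOR over character codes, which matches the Python port in this article; `xor_cipher` is an illustrative name, and the key is the one used in the Python test code.

```python
import base64
from itertools import cycle


def xor_cipher(text: str, key: str) -> str:
    """Repeating-key XOR over code points; applying it twice restores the input."""
    return "".join(chr(ord(c) ^ ord(k)) for c, k in zip(text, cycle(key)))


# First 16 base64 characters of the encrypted payload used in this article.
prefix = "HFcICkVSV1gEBAVX"
decoded = base64.b64decode(prefix).decode("utf-8")
plain = xor_cipher(decoded, "guanghui456")
print(plain)
```

If the output looks like the start of a JSON object, the key and the algorithm are very likely right; garbage after the first few characters usually points to a wrong key.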
Combining the analysis above, we get the following Python test code
python
import base64
from urllib.parse import unquote, quote
e = 'HFcICkVSV1gEBAVXRlNZVV9CXQcADlBHVVteSllLQFxCCxBDVEU0AFEHVwQ7AFQMUlgpHA0FAAMpFFkEUEw1QQFTVhE9G18MR1FoQAFSTFAyEg4TWVdpQ1EWWAg7HUxZBVRqEkNUC1c0AF8BAQ47AFdYUVApHAIDAF8pFFZRXxE1QQMBV0w9G1BYF1BoQAAERFUyEl5EWwBpQwETUQ07HUEPAANqEkBYWV40AFwHAgI7AFkMAVwpHABQDlEpFFYBDEw1QQMOVEI9G1MNRVloQAFSQwAyElAXDVAVFTsAV1xWWSkcAwMOUykUWgJeRDVBAwFSRD0bUF1ED2hAAANGA05ENABcBQxVOwBXWVJZKRwDAAVSKRRbUlxDSxgXRggGFQsVSk9LXEFCFwZbMkg0Wg8DG1QGHAUbSQsaBGgaXxMpThtaXUFfAgMEXk1UQlVcQFEABA5SRldIAQVIWwYHEAYFEVNWWENPUgh8NzAmLhQ3RUVDag5XRU0GOFlFWQQZRzhNUUIBNxQcQFoUS1cRAgYRFgdAFwxVQU1MAQUFBVVMVQkBQ1RFWkE1QQNUVUQ9G1FcFA1oQABSRgRMS0oFHFZZXxQdPhoOBRBLDgQAXkRXW1RbR1kYF1IGAQRMXUopHAEGUlYpFFsCWEY1QQMDAUM9G15dE11oQFABRABcV1pGNUEAU1BBWTISXkJZDAQGOwBXWwJdV0UWQV8KED4CAgYSHVwXDEVFVFRVUFdFFlFDFRQVBwgGV1MHBw5LV1cdOBgZCE1qQxUZQ1RFAAEdRA9qSClOBg4PHQVdUl4TWBcHAw0aR1dRWEkXAgsFBwZHV1pbO1oXBwMNGjUbA0U7WlQIBA5FWg0NG1RCVwpKWUQMURhXV0RWQ1ELTFtWUw5RQ1IKUUYYGQAKQDgTEwEKNwZUXF5XFwVMBgYHHghaGF4FE0NCRRgZCE1ARAtXW0wPHAEZRw9qSClOGANaWwtQRkIGAQgNSQsaBGgaWwMUTB4PUREaDUBbDRIVDV8ZHwdoGgVRRREySABHXwBpGVZDWF9RXkVfBQcGVUZZXlZZQF9oGlsDFEweD1ERGg1AWw0SFQ1fGR8HGlhGU0oXMQEaGgRrRgsPHgAeF0UdCFteVwlYCQwBThQcQF1pDBAYU1ZeTFgCDQRRTFJDV0VFRAwNDwMUVlhUUEJYDFAEX0xUD15aQVBVVwNfQwNYBg1FT1ZWUxEaBTEEABQHWlBaWgYEDxULHQtbTWkBEAQKQQsHVAQTVQNIUUgXDEhYEkVCWkZHAggPHA0JBgZeRlRcU1BGURJDXwNIUF5WW0VaBgIEUEJVXVJQQlsAAA9FWUMNCwkHAEBMYxUZQ1Q8E1cCUUwUXVcSCkVEVxtVW11FT1FCRRwcHVhQFF1XPRtRUEVeaEAAAkVUTEtKABtYFwxFHRUaFxtPNRtpGRERU0AFDAYdVUFfBFsCAQo0WgRQVBsXHVgKFFEABF5SQgRNEAQJNFpaAgVGO1oJXFFcKUYFAw9WQ1deUVlHWQYGDldEUFtRNFoEUFQbFx1YChRRAAReUkIETRAECUYYGQAKQDgTEwEKNwZUXF5XFwVMBgYHHghaGF4FE0cPEhwdNl9QT1pEV1dWXk1bAgwFSkVMXkpQTVBQVAFRRllZVlAQWwwMAwZMU1peCRdcDANUURQEXkEKFgxCWlI4FgkPCQYQBQlGUwYHAgYFBw02UlBTA1MCHFpYUwpQCAZBBQVTVk4FHQkGEAsaBgcDVUZZDQYDVUFZXV9OAwBQCAdXRFJeVFpCWwMCAlRAWVlVXEBQFhkUERwFCwgqBRoWDwJWQk1MEQcRP11RUwg9NkxdSkZfBBESUUFRTEtKAwBQUFk0HBsLRVJEXxoCGkUDDgoqBxofZ1xMAldbX1JeTVsDSBocVwoLHkpPS1xRFEtXEw8JA1dTBRkUExwVAgJKT0toQA8GEVkyEl4QWQEXGkUAEwJFUlcBQEFGFE89QTtHAw0GG1Q
DBhUPEwEWR1daWztaDAoGRQUBDVFFXgAMBAAcFlFFX1g7WlRZURgpRlwHAFMpTl9RUURfAgUAVkdQXlJaQ1wCBgY7WgwKBkUFAQ1RRV4ADAQAHBZRRV9YSRgRWlgeKg9GWls4BlwGDAkFGRldVwgeAABKABcPElRDEx0+BQIRSFgCDAdRTVNYXltYWRkFGwETVF5UDkcMAQVTVUBYC1cNE1pQBg5fQwRaBApMWAcBEAUWBBgIDCoKXFRYCRANUxQNFBtXXVQIDT4IAg0RT1dHC1ZTAgpaWFMZUAgHQQUVU1ROGQZTXFJaRlFXVF1HXQwGDkEDCApaWUVYBwUFVUJTWVBcRlwMAgRTQFhMS0oDAFBQWSUFEkxdXkZaGBdACBE3BwMNGiFjFwxFQFZYQ0xEWQYBFEtXFwcDDRo6XU9TRU9TW0lcWUtCWlIqGg4YNAEPDBYPB1JDVF5VFVkSFl5THldbTBQLV0UWR1cJHkNUVURXHV1BWgJXW0w7HU0NDABqEkMEXlJKWUtBR1pFT0MGExwFGg5pGTtaFwpVRhcNR0FXExwCQAQHGDUbWFIGWBEGXgwGUEFYXAABAlYWAhs1GwIEVwU9QQ9aQ11oGgdRTFBYUVhDWAYFBVNMVVdRXUc1G1hSBlgRBl4MBlBBWFwAAQJWFgIbR1lFAlgDPggVBxg2RwheDBQRHkoAFAZfVFhKHQMIQQkAHVxqXQIMXF9RUURfDAcAXkZMXkpYWFBXBlRVEwJbVwoUW1UABFQWUg8DC0NdBVEEBEFUWlNRUwtXUEAIET4NDwkbB1FZCxQQABwEABcGTGpQAhAFSAQaSFsSVlJaRUceA1VET0RBC1RTDQEAARFUBwUPVEBTWl9bTU9CXFJaRFFfVFhGWwMHAVBBUltfX0ddAQwUS1cXBwMNGitERhRdTFRXS0oDBlBjXwMQDiYwSk9LAwcGQ1FQXF9YV0UWQ18DEA49DhIQSw4GDklBTUwRBxEkW1pANBwbC0VSRFwCAw9TCDxCRR4cDVFaaRQBABoSG1dTBBkUERwFCwg3BwxeUFUTIRgeAkpPB0FZWktXCB04BBwCURcMV1lDBxQ3FgZYWVMEAQgBCUpPWRgXWg4eBExdW1lLXUZpFxQYMQQHGRxZWxRdRU1MFAsQB1FGFF0TAAIUDVlLQlpCAjEIHQYKGQx3QUQLV1tMV0pZS1JYWg4eBDEJHRhLDhcFRVlDDQgFGAxaQRRdRU1MAQUWBllYUwkBPgASBVdTFgVqEkMDXFY0AF8AVFI7AFdbVA1XRRZWWQsADAA4ARsPWxcMPChNTBcaEB9dUEEyBw0mExwFSw4XXhMBEVQ7RylGXFxRDxkICQ8cWB9dUVMIWwIKCUYXClFXWRRbAgEKNFofXVFTCClOWBQ0WlxSVlBXRlhWSltCX1AYB1YQBEMGWEReGQNVXkcDCF9eQ1pQAxgKBVVRETcTG1tYaRRICQUGGAVEXFRZDBQPQw8KE0sYF18UOQ4AAD4cDVFaFF0TAAIUDVlLUkdXChA+HRcaHB1RRhRdDkMHCgkSDGtARAsGQ1Q8Sh0dQEUMO1o9QQABBRoFG1QGHAUbSQsaBGgaXxMpThtaW0NfBw0GVkxVVktRRV0DBgRfQldIAQVIWgQHB0EUER5aW0VbBRNQWj8xKyBXAlQGAwdXUwlTVlxCWRYZFA8BFR5dNFo1G1JfFwZRQAUJHA1BG1UIGD1BDhwpRkEIBVVAUl5VUU1YBBkEUEJQWlVfTVsNE1AKSFJeVVlTCERFC1RFU19BDkgjZHBxWAJcXFFZRU9cCAdTQlFMS0odHUBFDDtaPUEAAQUaBBtUBhwFG0kLGgRoGl8TKU4bWllFWQwND1dHU19LWUNcDQcOX0NRXUEOGFQHBQRWUwAeF1VGWQYEEAFIKz4iL0oeCQcAVkVHBlpZQV4EFxpFHRUaF1IpRmgaUQ4FEl5JChQAUEAYBBoMMkgBATUbQAtUQVhYVllAWAQMGlZEWF5
SW0ZcAAYQARhcXVdaRE9VRUZaRlFcVk4TVH5lcyBKFlNVXkRZEl0LVkFWXkU1WUtdW0ICBxcPC0pPWBgXTjgZBABFUkRZGBdOOAYIFAJKT1sCBBpFDD4CAgZXUwUFGkUMPh0OEhBLDgQCUAhNTA4bKg9bR1QOEQULCTcGAVVHU0VPUUJFGx0IRlB/CRMOTF0TVx1NRVNFT0MbFQRXFBgXVQgZDQsEHBwGWmpfCRMOTF0zKBQ='
def simple_decode(data):
    # First, base64-decode the payload
    data = base64.b64decode(data)
    # Then mimic escape + decodeURIComponent: percent-encode the bytes and decode them as UTF-8
    data = unquote(quote(data), encoding='utf-8')
    return data


def xor_strings(data, key):
    # XOR each character with the key, repeating the key as needed
    n = ""
    for i in range(len(data)):
        o = ord(data[i])
        s = ord(key[i % len(key)])
        n += chr(o ^ s)
    return n


print(xor_strings(simple_decode(e), 'guanghui456'))
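Because XOR with the same key is its own inverse, the decryption can be tested without depending on a live payload. The following is a minimal sketch under that assumption; `encrypt`, `decrypt`, `xor_text`, and the sample JSON are hypothetical helpers and data introduced only for this check.

```python
import base64
from itertools import cycle

KEY = "guanghui456"  # key used in the test code above


def xor_text(text: str, key: str) -> str:
    # Repeating-key XOR over code points.
    return "".join(chr(ord(c) ^ ord(k)) for c, k in zip(text, cycle(key)))


def decrypt(payload: str, key: str = KEY) -> str:
    # base64 -> UTF-8 text -> XOR with the key
    return xor_text(base64.b64decode(payload).decode("utf-8"), key)


def encrypt(plain: str, key: str = KEY) -> str:
    # Inverse pipeline, used here only to build test fixtures.
    return base64.b64encode(xor_text(plain, key).encode("utf-8")).decode("ascii")


# Made-up sample data for the round trip.
sample = '{"id":"demo","url":"https://example.com/v.mp4"}'
round_trip = decrypt(encrypt(sample))
print(round_trip == sample)
```

A round trip like this can serve as a regression test in the business code, so a future change on the site shows up as a decryption failure rather than a silent data error.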
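Summary
ChatGPT can help with JavaScript reverse engineering in two ways:
- Code comprehension: for JavaScript that is hard to read, GPT can assist in understanding it. Some of it may be a well-known algorithm, which ChatGPT can recognize.
- Code rewriting: translating the JavaScript into another language, such as Python or Java, so it can be integrated into business code.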
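Closing note
The first time I actually did this, it went nowhere near as smoothly as this article makes it sound; even the base64decode step was something GPT guessed...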