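Build Your Own JAR File Analysis Tool: Class-Name Matching, Binary Search, and Log Output

Copyright belongs to the author. If you repost this article, please credit the source: cyrus-studio.github.io/blog/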

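Introduction

In reverse engineering, APK unpacking, or troubleshooting a Java project, we often need to check whether a particular .class file exists in a JAR, or whether a keyword is hard-coded in a class's bytecode.

This article shows how to write a Python script that quickly locates classes, fields, and even raw binary content inside JAR files.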

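Class path matching (path prefix)

A JAR file is essentially a ZIP archive. Once it is opened with zipfile.ZipFile, namelist() returns every entry name (for classes, the .class path) without extracting anything, so we only need to check whether an entry starts with a given prefix.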

with zipfile.ZipFile(jar_path, 'r') as jar:
    for entry in jar.namelist():
        if entry.endswith(".class") and entry.startswith(class_prefix_path):
            print(f"[✓] Found in: {jar_path} → {entry}")
            found.append((jar_path, entry))
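
To try the prefix check without a real JAR on disk, here is a minimal sketch that runs it against an in-memory archive. The function names find_classes_by_prefix and build_demo_jar, and the com/example entries, are made up for illustration; the payloads are placeholders, not real class files.

```python
import io
import zipfile


def find_classes_by_prefix(jar_file, class_prefix):
    """Return the .class entries whose path starts with the dotted prefix."""
    # JAR entries use '/' separators, so com.example. becomes com/example/
    prefix_path = class_prefix.replace(".", "/")
    with zipfile.ZipFile(jar_file, "r") as jar:
        return [
            entry for entry in jar.namelist()
            if entry.endswith(".class") and entry.startswith(prefix_path)
        ]


def build_demo_jar():
    """Build a tiny in-memory JAR with hypothetical entries (assumption)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        jar.writestr("com/example/net/HttpClient.class", b"\xca\xfe\xba\xbe")
        jar.writestr("com/example/ui/MainView.class", b"\xca\xfe\xba\xbe")
        jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(find_classes_by_prefix(build_demo_jar(), "com.example.net."))
```

Because the comparison is a plain startswith, a trailing dot in the prefix (com.example.net.) restricts the match to that package, while omitting it also matches sibling names that merely begin with the same text.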

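Bytecode field lookup (binary matching)

By reading the raw bytes of a .class file, we can tell whether it contains a given hard-coded string. For example, to find out whether "VERSION_NAME" has been written into a class's constant pool, search for it like this: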

with zipfile.ZipFile(jar_path, 'r') as jar:
    for entry in jar.namelist():
        if entry.endswith(".class"):
            try:
                with jar.open(entry) as class_file:
                    content = class_file.read()
                    if keyword.encode() in content:
                        print(f"[✓] Found '{keyword}' in {entry} → {jar_path}")
                        found.append((jar_path, entry))
            except Exception as e:
                print(f"[!] Failed reading {entry} in {jar_path}: {e}")

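Note: this is a byte-level search, similar to the strings tool. keyword.encode() produces UTF-8, which is byte-for-byte what a class file stores for ASCII text. Class names referenced inside bytecode are stored with slashes (com/example/MyClass), not dots.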
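
The byte-level behaviour can be checked with a small sketch. The helper names and the demo/ entries below are hypothetical, and the payloads are fake bytes standing in for class files; the point is only that the search runs on the decompressed entry content and that a dotted class name does not match a slash-separated reference.

```python
import io
import zipfile

MAGIC = b"\xca\xfe\xba\xbe"  # first four bytes of every real class file


def classes_containing(jar_file, keyword):
    """Return the .class entries whose raw bytes contain the keyword."""
    needle = keyword.encode("utf-8")
    hits = []
    with zipfile.ZipFile(jar_file, "r") as jar:
        for entry in jar.namelist():
            # jar.read() returns the decompressed bytes of the entry
            if entry.endswith(".class") and needle in jar.read(entry):
                hits.append(entry)
    return hits


def build_demo_jar():
    """In-memory, compressed JAR with fake payloads (assumption)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as jar:
        jar.writestr("demo/BuildConfig.class", MAGIC + b"VERSION_NAME")
        jar.writestr("demo/Caller.class", MAGIC + b"demo/BuildConfig")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(classes_containing(build_demo_jar(), "VERSION_NAME"))
    print(classes_containing(build_demo_jar(), "demo/BuildConfig"))
    print(classes_containing(build_demo_jar(), "demo.BuildConfig"))
```

When hunting for references to a class rather than a string literal, it is therefore worth searching for the slash form of its name.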

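Combined path and content search (double matching)

Checking both the entry path and the binary content at the same time is useful for broad keyword searches.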

# ① Keyword appears in the class path
if keyword_path in entry:
    print(f"[✓] Keyword in class name: {entry} ({jar_path})")
    matched = True

# ② Keyword appears in the bytecode (e.g. a string constant)
try:
    with jar.open(entry) as class_file:
        content = class_file.read()
        if keyword_bin in content:
            print(f"[✓] Keyword in class bytecode: {entry} ({jar_path})")
            matched = True
except Exception as e:
    print(f"[!] Failed reading {entry} in {jar_path}: {e}")
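
The fragment above relies on an enclosing loop over JAR entries. As a self-contained sketch of the same double check, the version below records why each entry matched. The names match_entry, search_jar, and build_demo_jar, along with the demo entries, are assumptions for illustration and not part of the original script.

```python
import io
import zipfile

MAGIC = b"\xca\xfe\xba\xbe"


def match_entry(jar, entry, keyword):
    """Return the reasons ('name', 'bytecode') for which an entry matches."""
    reasons = []
    if keyword.replace(".", "/") in entry:
        reasons.append("name")
    if keyword.encode("utf-8") in jar.read(entry):
        reasons.append("bytecode")
    return reasons


def search_jar(jar_file, keyword):
    """Map each matching .class entry to its list of match reasons."""
    results = {}
    with zipfile.ZipFile(jar_file, "r") as jar:
        for entry in jar.namelist():
            if not entry.endswith(".class"):
                continue
            reasons = match_entry(jar, entry, keyword)
            if reasons:
                results[entry] = reasons
    return results


def build_demo_jar():
    """In-memory JAR with fake payloads (assumption)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        jar.writestr("demo/api/RetrofitApi.class", MAGIC + b"some other text")
        jar.writestr("demo/net/Client.class", MAGIC + b"Retrofit")
        jar.writestr("demo/net/RetrofitImpl.class", MAGIC + b"uses Retrofit")
        jar.writestr("demo/ui/View.class", MAGIC + b"nothing here")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for entry, reasons in search_jar(build_demo_jar(), "Retrofit").items():
        print(entry, reasons)
```

Keeping the reasons separate makes it easy to count an entry once even when both checks succeed, which is what the matched flag does in the fragment above.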

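Writing the command-line entry point

Use argparse.ArgumentParser to create the argument parser:

1. directory: the directory to search, i.e. the location that contains the .jar files (subdirectories are scanned as well).

2. keyword: the search keyword, matched against class paths (e.g. com/example/MyClass) or against content in the bytecode (e.g. a string or a variable name).

3. --mode: the search mode. The default is "class"; the options are:

  • "class": match class paths only; the keyword is treated as a path prefix.

  • "field": search only the contents of .class files, such as field and method names (binary search).

  • "all": search both; in this mode the keyword may appear anywhere in the class path.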

if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser(description="Search for class name or class content keyword in JAR files.")
    parser.add_argument("directory", help="Directory to search")
    parser.add_argument("keyword", help="Class prefix or bytecode keyword")
    parser.add_argument("--mode", choices=["class", "field", "all"], default="class",
                        help="Search mode: 'class' (class path), 'field' (bytecode), 'all' (both)")

    args = parser.parse_args()

    if args.mode == "class":
        find_class_in_jars(args.directory, args.keyword)
    elif args.mode == "field":
        find_field_in_jars(args.directory, args.keyword)
    elif args.mode == "all":
        find_class_and_content_in_jars(args.directory, args.keyword)
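
The parser can be exercised without a shell by passing an explicit argument list to parse_args. The sketch below wraps the same three arguments in a hypothetical build_parser helper (not in the original script) so the defaults and the choices check are easy to inspect; "./jars" is a placeholder path.

```python
import argparse


def build_parser():
    """Build the same parser as above; the wrapper function is an assumption."""
    parser = argparse.ArgumentParser(
        description="Search for class name or class content keyword in JAR files.")
    parser.add_argument("directory", help="Directory to search")
    parser.add_argument("keyword", help="Class prefix or bytecode keyword")
    parser.add_argument("--mode", choices=["class", "field", "all"], default="class",
                        help="Search mode: 'class' (class path), 'field' (bytecode), 'all' (both)")
    return parser


if __name__ == "__main__":
    parser = build_parser()

    # No --mode given, so the default applies
    args = parser.parse_args(["./jars", "com.example."])
    print(args.directory, args.keyword, args.mode)

    # Explicit mode
    args = parser.parse_args(["./jars", "VERSION_NAME", "--mode", "field"])
    print(args.directory, args.keyword, args.mode)
```

A value outside the choices list makes argparse print a usage message and exit, so a mistyped mode never reaches the search functions.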

Usage examples

1. Finding a class

Check whether a class exists, for example com.bytedance.retrofit2.SsResponse:

(anti-app) PS D:\Python\anti-app\dex2jar> python find_in_jars.py "D:\Python\anti-app\app\douyin\dump_dex\jar" "com.bytedance.retrofit2.SsResponse"
[+] Searching for class prefix: com/bytedance/retrofit2/SsResponse
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes28.jar → com/bytedance/retrofit2/SsResponse.class
[+] Total 1 match(es) found.

Prefix matching is also supported, for example listing every class under the com.bytedance.ttnet package:

(anti-app) PS D:\Python\anti-app\dex2jar> python find_in_jars.py "D:\Python\anti-app\app\douyin\dump_dex\jar" "com.bytedance.ttnet."
[+] Searching for class prefix: com/bytedance/ttnet/
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes2.jar → com/bytedance/ttnet/TTNetInit$ENV.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes23.jar → com/bytedance/ttnet/debug/DebugSetting.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/http/HttpRequestInfo.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/http/RequestContext.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/HttpClient.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/ITTNetDepend.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/TTALog.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/TTMultiNetwork.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/TTNetInit.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/clientkey/ClientKeyManager.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/config/AppConfig.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/config/TTHttpCallThrottleControl$DelayMode.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/config/TTHttpCallThrottleControl.class        
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/cronet/AbsCronetDependAdapter.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/diagnosis/TTNetDiagnosisService.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/priority/TTHttpCallPriorityControl$ModeType.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/priority/TTHttpCallPriorityControl.class      
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/retrofit/SsInterceptor.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/retrofit/SsRetrofitClient.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/throttle/TTNetThrottle.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/tnc/TNCManager$TNCUpdateSource.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/tnc/TNCManager.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/utils/RetrofitUtils$CompressType.class        
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes37.jar → com/bytedance/ttnet/utils/RetrofitUtils.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes47.jar → com/bytedance/ttnet/diagnosis/IDiagnosisRequest.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes47.jar → com/bytedance/ttnet/diagnosis/IDiagnosisCallback.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes47.jar → com/bytedance/ttnet/diagnosis/TTGameDiagnosisService.class        
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes49.jar → com/bytedance/ttnet/http/IRequestHolder.class
[✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes7.jar → com/bytedance/ttnet/INetworkApi.class
[+] Total 29 match(es) found.

2. Searching class bytecode

Check whether any class's bytecode contains a given field (such as VERSION_NAME):

(anti-app) PS D:\Python\anti-app\dex2jar> python find_in_jars.py "D:\Python\anti-app\app\douyin\dump_dex\jar" VERSION_NAME --mode field
[✓] Found 'VERSION_NAME' in com/bykv/vk/openvk/api/proto/BuildConfig.class → D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk.jar
[✓] Found 'VERSION_NAME' in com/byted/cast/proxy/BuildConfig.class → D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk.jar
[✓] Found 'VERSION_NAME' in com/ss/ttm/player/TTPlayerConfiger.class → D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes9.jar
. . . . . .
[+] Total 128 matches found.

3. Combined class path and bytecode search

Search both class paths and bytecode for a keyword:

(anti-app) PS D:\Python\anti-app\dex2jar> python find_in_jars.py "D:\Python\anti-app\app\douyin\dump_dex\jar" Retrofit --mode all                   
[+] Searching for class path or class bytecode containing: Retrofit
[✓] Keyword in class bytecode: X/01Ek.class (D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk.jar)
[✓] Keyword in class name: com/bytedance/android/live/broadcast/api/BroadcastConfigRetrofitApi.class (D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk.jar)
. . . . . .
[✓] Keyword in class bytecode: X/0ppk.class (D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes27.jar)
[✓] Keyword in class bytecode: kotlin/jvm/internal/ALambdaS879S0100000_16.class (D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes9.jar)
[+] Total 1639 match(es) found.
[+] Matched JAR count: 49
[+] Matched JAR files:
    - D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk.jar
    - D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes2.jar
    - D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes3.jar
    - D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes4.jar
    - D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes5.jar
    - D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes6.jar
    - D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes7.jar
    - D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes8.jar
    - D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes9.jar
    - D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes10.jar
    . . . . . .

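Logging to both the terminal and a log file

To send log output to both the terminal and a chosen log file, bring in the logging module from the Python standard library and use a --logfile command-line argument to set the log file path.

1. Import the logging module at the top of the file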

import logging

2. Add a logger setup function

Add the following function near the top of the file to configure log output:

def setup_logger(logfile: str = None):
    """
    Configure log output, optionally writing to a file as well.
    :param logfile: path of the log file (optional)
    """
    log_format = "[%(asctime)s] %(message)s"
    logging.basicConfig(
        level=logging.INFO,
        format=log_format,
        handlers=[
            logging.StreamHandler(),  # console output
            logging.FileHandler(logfile, mode='w', encoding='utf-8') if logfile else logging.NullHandler()
        ]
    )
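
One thing to keep in mind: logging.basicConfig does nothing when the root logger already has handlers, so a second call (or an earlier call made by another module) would leave the configuration unchanged. The variant below is only a sketch under that assumption. It uses the force=True argument available since Python 3.8, builds the handler list without a NullHandler placeholder, and uses the hypothetical name configure_logging to keep it apart from the article's setup_logger.

```python
import logging
import os
import tempfile


def configure_logging(logfile=None):
    """Log to the console, and to logfile as well when a path is given."""
    handlers = [logging.StreamHandler()]
    if logfile:
        handlers.append(logging.FileHandler(logfile, mode="w", encoding="utf-8"))
    logging.basicConfig(
        level=logging.INFO,
        format="[%(asctime)s] %(message)s",
        handlers=handlers,
        force=True,  # replace any handlers already attached to the root logger
    )


if __name__ == "__main__":
    # Temporary location used only for this demonstration
    path = os.path.join(tempfile.mkdtemp(), "log.txt")
    configure_logging(path)
    logging.info("[+] written to the terminal and to %s", path)
    logging.warning("[!] warnings use the same format")
    logging.shutdown()  # flush and close the file handler
    with open(path, encoding="utf-8") as fh:
        print(fh.read())
```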

3. Replace print(...) with logging.info(...) or logging.warning(...)

For example:

print(f"[+] Searching for class prefix: {class_prefix_path}")

becomes:

logging.info(f"[+] Searching for class prefix: {class_prefix_path}")

4. Add the argument in main and initialize logging

parser.add_argument("--logfile", help="Log output to specified file (optional)")

Then call it after parsing the arguments:

setup_logger(args.logfile)

Testing

Append --logfile log.txt to the command:

python find_in_jars.py "D:\Python\anti-app\app\douyin\dump_dex\jar" "com.bytedance.retrofit2.SsResponse" --logfile log.txt

The log messages appear both in the terminal and in log.txt.

(anti-app) PS D:\Python\anti-app\dex2jar> python find_in_jars.py "D:\Python\anti-app\app\douyin\dump_dex\jar" "com.bytedance.retrofit2.SsResponse" --logfile log.txt
[2025-07-20 22:27:57,695] [+] Searching for class prefix: com/bytedance/retrofit2/SsResponse
[2025-07-20 22:27:58,796] [✓] Found in: D:\Python\anti-app\app\douyin\dump_dex\jar\base.apk_classes28.jar → com/bytedance/retrofit2/SsResponse.class
[2025-07-20 22:28:00,267] [+] Total 1 match(es) found.

log.txt

Full source code

import logging
import os
import re
import zipfile
from typing import List


def setup_logger(logfile: str = None):
    """
    Configure log output, optionally writing to a file as well.
    :param logfile: path of the log file (optional)
    """
    log_format = "[%(asctime)s] %(message)s"
    logging.basicConfig(
        level=logging.INFO,
        format=log_format,
        handlers=[
            logging.StreamHandler(),  # console output
            logging.FileHandler(logfile, mode='w', encoding='utf-8') if logfile else logging.NullHandler()
        ]
    )


def find_class_in_jars(directory, class_prefix):
    """
    Find every .class file whose path starts with the given class-name prefix (package or class prefixes both work).

    :param directory: directory to scan
    :param class_prefix: class or package prefix (e.g. com.example. or com.example.MyClass)
    """
    if not class_prefix:
        logging.info("[-] Class name prefix cannot be empty.")
        return

    # Convert the class name to the path format used inside a JAR (e.g. com.example. → com/example/)
    class_prefix_path = class_prefix.replace('.', '/')

    logging.info(f"[+] Searching for class prefix: {class_prefix_path}")
    found = []

    for root, _, files in os.walk(directory):
        for file in files:
            if file.endswith(".jar"):
                jar_path = os.path.join(root, file)
                try:
                    with zipfile.ZipFile(jar_path, 'r') as jar:
                        for entry in jar.namelist():
                            if entry.endswith(".class") and entry.startswith(class_prefix_path):
                                logging.info(f"[✓] Found in: {jar_path} → {entry}")
                                found.append((jar_path, entry))
                except zipfile.BadZipFile:
                    logging.info(f"[!] Skipping corrupted jar: {jar_path}")

    if not found:
        logging.info("[-] No matching class found.")
    else:
        logging.info(f"[+] Total {len(found)} match(es) found.")


def find_field_in_jars(directory, keyword):
    """
    Find the class (.class) files that contain the given field across all JAR files under a directory.

    :param directory: directory to scan
    :param keyword: field string to look for (e.g. VERSION_NAME)
    """
    found = []

    for root, _, files in os.walk(directory):
        for file in files:
            if file.endswith(".jar"):
                jar_path = os.path.join(root, file)
                try:
                    with zipfile.ZipFile(jar_path, 'r') as jar:
                        for entry in jar.namelist():
                            if entry.endswith(".class"):
                                try:
                                    with jar.open(entry) as class_file:
                                        content = class_file.read()
                                        if keyword.encode() in content:
                                            logging.info(f"[✓] Found '{keyword}' in {entry} → {jar_path}")
                                            found.append((jar_path, entry))
                                except Exception as e:
                                    logging.info(f"[!] Failed reading {entry} in {jar_path}: {e}")
                except zipfile.BadZipFile:
                    logging.info(f"[!] Bad JAR file: {jar_path}")

    if not found:
        logging.info(f"[-] No classes containing '{keyword}' found.")
    else:
        logging.info(f"\n[+] Total {len(found)} matches found.")

    return found


def sort_jar_paths(jar_paths: List[str]) -> List[str]:
    """
    Sort a list of paths such as base.apk.jar and base.apk_classesN.jar so that _classes2 comes before _classes10.

    :param jar_paths: unsorted JAR file paths
    :return: sorted JAR file paths
    """

    def extract_index(path: str) -> int:
        """
        Extract the number N from _classesN in the path for use as the sort key.
        Returns 0 for base.apk.jar so that it sorts first.
        """
        match = re.search(r'_classes(\d+)\.jar$', path)
        if match:
            return int(match.group(1))  # the N in _classesN
        return 0  # base.apk.jar has no _classesN, so use the smallest value

    # Sort by the extracted numeric index
    return sorted(jar_paths, key=extract_index)


def find_class_and_content_in_jars(directory, keyword):
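    """
    Search every JAR under the given directory for:
    1. class names whose path contains the keyword
    2. classes whose bytecode contains the keyword

    :param directory: directory to search
    :param keyword: keyword to look for (a class path or a content keyword)
    """
    if not keyword:
        logging.info("[-] Keyword cannot be empty.")
        return

    logging.info(f"[+] Searching for class path or class bytecode containing: {keyword}")

    keyword_bin = keyword.encode()  # bytes form used for content matching
    keyword_path = keyword.replace('.', '/')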

    matched_entries = []
    matched_jars = set()

    for root, _, files in os.walk(directory):
        for file in files:
            if file.endswith(".jar"):
                jar_path = os.path.join(root, file)
                try:
                    with zipfile.ZipFile(jar_path, 'r') as jar:
                        for entry in jar.namelist():
                            if not entry.endswith(".class"):
                                continue

                            matched = False

                            # ① Keyword appears in the class path
                            if keyword_path in entry:
                                logging.info(f"[✓] Keyword in class name: {entry} ({jar_path})")
                                matched = True

                            # ② Keyword appears in the bytecode (e.g. a string constant)
                            try:
                                with jar.open(entry) as class_file:
                                    content = class_file.read()
                                    if keyword_bin in content:
                                        logging.info(f"[✓] Keyword in class bytecode: {entry} ({jar_path})")
                                        matched = True
                            except Exception as e:
                                logging.info(f"[!] Failed reading {entry} in {jar_path}: {e}")

                            if matched:
                                matched_entries.append((jar_path, entry))
                                matched_jars.add(jar_path)

                except zipfile.BadZipFile:
                    logging.info(f"[!] Skipping corrupted jar: {jar_path}")

    if not matched_entries:
        logging.info(f"[-] No match found for keyword '{keyword}'")
    else:
        logging.info(f"\n[+] Total {len(matched_entries)} match(es) found.")
        logging.info(f"[+] Matched JAR count: {len(matched_jars)}")
        logging.info("[+] Matched JAR files:")
        for jar_file in sort_jar_paths(matched_jars):
            logging.info(f"    - {jar_file}")


if __name__ == "__main__":
    r"""
    Example usage (search by class path, by class content, or by both):

    1. Search by class path (does a given class exist?):
        python find_in_jars.py "D:\Python\anti-app\app\douyin\dump_dex\jar" com.bytedance.retrofit2.SsResponse

       Package-prefix matching is also supported:
        python find_in_jars.py "D:\Python\anti-app\app\douyin\dump_dex\jar" com.bytedance.ttnet.

    2. Search bytecode content (string constants, field names, and so on):
        python find_in_jars.py "D:\Python\anti-app\app\douyin\dump_dex\jar" VERSION_NAME --mode field

    3. Search class paths and bytecode for a keyword at the same time:
        python find_in_jars.py "D:\Python\anti-app\app\douyin\dump_dex\jar" com.bytedance.retrofit2.Retrofit --mode all

    4. Write the results to a log file (can be combined with any command above):
        python find_in_jars.py "D:\Python\anti-app\app\douyin\dump_dex\jar" com.bytedance.ttnet. --mode all --logfile log.txt
    """
    import argparse

    parser = argparse.ArgumentParser(description="Search for class name or class content keyword in JAR files.")
    parser.add_argument("directory", help="Directory to search")
    parser.add_argument("keyword", help="Class prefix or bytecode keyword")
    parser.add_argument("--mode", choices=["class", "field", "all"], default="class",
                        help="Search mode: 'class' (class path), 'field' (bytecode), 'all' (both)")
    parser.add_argument("--logfile", help="Log output to specified file (optional)")

    args = parser.parse_args()

    # Initialize logging
    setup_logger(args.logfile)

    if args.mode == "class":
        find_class_in_jars(args.directory, args.keyword)
    elif args.mode == "field":
        find_field_in_jars(args.directory, args.keyword)
    elif args.mode == "all":
        find_class_and_content_in_jars(args.directory, args.keyword)
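
The sort_jar_paths helper in the listing above exists because plain string sorting puts _classes10 before _classes2. The short sketch below illustrates the difference on three hypothetical file names. It is a compact variant, not the original function: the helper name classes_index is made up, and the path is added as a tie-breaker so the result stays deterministic when the input is a set.

```python
import re


def classes_index(path):
    """Return N for ..._classesN.jar, or 0 when there is no such suffix."""
    match = re.search(r"_classes(\d+)\.jar$", path)
    return int(match.group(1)) if match else 0


def sort_jar_paths(jar_paths):
    """Sort by numeric index, falling back to the path itself to break ties."""
    return sorted(jar_paths, key=lambda p: (classes_index(p), p))


if __name__ == "__main__":
    paths = {"base.apk_classes10.jar", "base.apk_classes2.jar", "base.apk.jar"}
    print(sorted(paths))          # plain string order
    print(sort_jar_paths(paths))  # numeric order
```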

Source code: github.com/CYRUS-STUDI...
