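2.4 (Part 1): Firmware Security Analysis and Vulnerability Hunting: A Complete Hands-On Guide from Extraction to Reverse Engineering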

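This article walks through, step by step and from a practitioner's point of view, how to analyze the security of embedded device firmware. It covers the whole workflow: extracting the firmware, analyzing its file system, and reverse engineering binaries to find vulnerabilities. Whether you are a security researcher, an IoT developer, or a security hobbyist, you should come away with practical skills.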

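1. Firmware Security: Why Does It Matter?

1.1 Firmware Vulnerabilities in the Real World

Before getting into the technical details, consider a few sobering cases:

Case 1: Router backdoor incident (CVE-2019-19824)

The firmware of a well-known router brand was reported to contain a hardcoded backdoor account. With the right username and password, an attacker could take full control of the device. The issue reportedly affected more than one million devices.

Case 2: Smart camera privacy leak

A weak encryption implementation in the firmware of an IoT camera allowed attackers to decrypt the video stream with little effort, putting the home privacy of millions of users at risk.

Case 3: Industrial PLC firmware vulnerability

The PLC firmware of an industrial control system contained a buffer overflow that let attackers execute malicious code remotely, potentially halting production lines or even causing safety incidents.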

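1.2 The Firmware Threat Landscape

1.3 Learning Roadmap: From Zero to Proficient

This article follows the path below, going one step deeper at each stage:

Obtain firmware → Extract and unpack → Analyze the file system → Collect sensitive information → Reverse engineer → Hunt for vulnerabilities → Write an exploit → Hardening recommendations

2. Environment Setup: Deploying the Analysis Environment in One Step

Before starting, use Docker to quickly build a complete analysis environment: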

# Dockerfile.firmware-analysis

FROM ubuntu:22.04

LABEL maintainer="firmware-security@example.com"
LABEL version="1.0"
LABEL description="All-in-one firmware security analysis environment"

# Set the time zone and force non-interactive package installation
ENV TZ=Asia/Shanghai
ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && apt-get install -y --no-install-recommends \
    # Basic tools
    build-essential git curl wget unzip ca-certificates vim nano file tree \
    # Core firmware analysis tools
    binwalk srecord flashrom lzop u-boot-tools mtd-utils \
    # Emulation and debugging
    qemu-user-static qemu-system gdb-multiarch strace ltrace \
    # Reverse engineering (the JDK is required by Ghidra)
    radare2 hexedit xxd openjdk-17-jdk \
    # Network analysis
    net-tools netcat-openbsd tcpdump nmap \
    # Python environment
    python3 python3-pip python3-dev \
    # Other dependencies
    squashfs-tools cpio zlib1g-dev liblzma-dev liblzo2-dev \
    && rm -rf /var/lib/apt/lists/*

# Install Python tools (python-magic is used by the analysis script in section 4.3)
RUN pip3 install --no-cache-dir \
    pycryptodome scapy requests beautifulsoup4 colorama progressbar2 python-magic

# Create the working directories
WORKDIR /workspace
RUN mkdir -p /workspace/firmware /workspace/output /workspace/scripts

# Install community tools
# The binwalk master branch no longer ships setup.py, so pin a 2.x tag,
# which also provides the Python API used later in this article
RUN git clone --depth 1 --branch v2.3.4 https://github.com/ReFirmLabs/binwalk.git /opt/binwalk && \
    cd /opt/binwalk && \
    python3 setup.py install

# extract-firmware.sh needs a firmware image as its argument, so it is only cloned here
RUN git clone https://github.com/rampageX/firmware-mod-kit.git /opt/firmware-mod-kit

RUN git clone https://github.com/craigz28/firmwalker.git /opt/firmwalker

# Download and install Ghidra (the 20230829 build is release 10.3.3)
RUN wget -q https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_10.3.3_build/ghidra_10.3.3_PUBLIC_20230829.zip && \
    unzip -q ghidra_10.3.3_PUBLIC_20230829.zip -d /opt/ && \
    rm ghidra_10.3.3_PUBLIC_20230829.zip && \
    ln -s /opt/ghidra_10.3.3_PUBLIC /opt/ghidra

# Set environment variables
ENV PATH="/opt/ghidra:/opt/firmware-mod-kit:${PATH}"
ENV GHIDRA_INSTALL_DIR="/opt/ghidra"

# Create a shortcut launcher (Ghidra is started through its ghidraRun script)
RUN printf '#!/bin/bash\nexec /opt/ghidra/ghidraRun "$@"\n' > /usr/local/bin/ghidra && \
    chmod +x /usr/local/bin/ghidra

# Default command
CMD ["/bin/bash"]

Build and run it with the following commands:

# Build the Docker image
docker build -t firmware-analysis:latest -f Dockerfile.firmware-analysis .

# Run the container and mount the local directories
docker run -it --rm \
    -v "$(pwd)/firmware:/workspace/firmware" \
    -v "$(pwd)/output:/workspace/output" \
    -v "$(pwd)/scripts:/workspace/scripts" \
    --name firmware-lab \
    firmware-analysis:latest

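3. Obtaining Firmware: Four Approaches Explained

3.1 Approach 1: Official Channels

The most direct way is to download the firmware from the device vendor's website: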

#!/usr/bin/env python3
"""
Automated firmware download script.
Designed for several router vendors; only the TP-Link parser is sketched here.
"""
import os
import re
from typing import Dict, List
from urllib.parse import urljoin

import requests


class FirmwareDownloader:
    """Automated firmware downloader."""

    VENDOR_URLS = {
        'tplink': 'https://www.tp-link.com.cn/download-center.html',
        'asus': 'https://www.asus.com.cn/support/Download-Center/',
        'netgear': 'https://www.netgear.com.cn/support/download/',
        'dlink': 'http://support.dlink.com.cn/',
        'mercury': 'http://service.mercurycom.com.cn/download.html'
    }

    def __init__(self, vendor: str, model: str):
        self.vendor = vendor.lower()
        self.model = model.upper()
        self.session = requests.Session()
        self.session.headers.update({
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
        })

    def search_firmware(self) -> List[Dict]:
        """Search the vendor's download page for firmware matching the model."""
        if self.vendor not in self.VENDOR_URLS:
            raise ValueError(f"Unsupported router vendor: {self.vendor}")

        base_url = self.VENDOR_URLS[self.vendor]
        response = self.session.get(base_url, timeout=30)
        response.raise_for_status()

        # Each vendor's page needs its own parsing logic
        if self.vendor == 'tplink':
            return self._parse_tplink(response.text)

        # Parsers for the other vendors are not implemented in this example
        return []

    def _parse_tplink(self, html: str) -> List[Dict]:
        """Parse the TP-Link download page."""
        # Real parsing logic is more involved; this is a simplified example.
        # re.escape() keeps special characters in the model name from changing the pattern.
        pattern = rf'href="([^"]*{re.escape(self.model)}[^"]*\.(?:bin|zip|rar|7z))"'
        matches = re.findall(pattern, html, re.IGNORECASE)

        firmwares = []
        for match in matches:
            firmwares.append({
                'url': urljoin(self.VENDOR_URLS['tplink'], match),
                'filename': os.path.basename(match),
                'model': self.model,
                'vendor': 'TP-Link'
            })
        return firmwares

    def download(self, firmware_info: Dict, save_dir: str = './firmware') -> str:
        """Download a firmware file and return the path it was saved to."""
        os.makedirs(save_dir, exist_ok=True)

        url = firmware_info['url']
        filename = firmware_info['filename']
        save_path = os.path.join(save_dir, filename)

        print(f"Downloading: {filename}")
        response = self.session.get(url, stream=True, timeout=60)
        response.raise_for_status()

        with open(save_path, 'wb') as f:
            for chunk in response.iter_content(chunk_size=8192):
                if chunk:
                    f.write(chunk)

        print(f"Download complete: {save_path}")
        return save_path


# Usage example
if __name__ == '__main__':
    downloader = FirmwareDownloader('tplink', 'WR842N')
    firmwares = downloader.search_firmware()

    if firmwares:
        latest = firmwares[0]  # Assumes the first entry is the latest version
        downloader.download(latest)
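
Once a firmware image has been downloaded, it is worth confirming that the file matches the checksum the vendor publishes, when one is available. The following is a minimal sketch of that check. The function names `sha256_of_file` and `verify_firmware` are hypothetical helpers introduced for illustration, and the sketch assumes the vendor publishes a SHA-256 value (some vendors publish MD5 or nothing at all).

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_firmware(path: Path, expected_sha256: str) -> bool:
    """Compare the file digest with a published value, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

Reading in chunks keeps memory use flat even for large images. A mismatch does not by itself prove tampering (the page may list the hash of a different revision), but it is a reason to stop and re-download before analyzing further.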

3.2 Approach 2: Capturing OTA Updates

Many IoT devices support online updates, so the update request can be intercepted:

#!/usr/bin/env python3
"""
Intercept and analyze OTA firmware update packages.
Runs as a mitmproxy addon in a man-in-the-middle position between device and server.
"""
import datetime
import hashlib
import json
import subprocess
from pathlib import Path

from mitmproxy import http


class OTAFirmwareInterceptor:
    """OTA firmware interceptor."""

    URL_KEYWORDS = ('firmware', 'update', 'upgrade')
    FILE_EXTENSIONS = ('.bin', '.img', '.fwpkg', '.zip', '.gz')

    def __init__(self, output_dir="./ota_captures"):
        self.output_dir = Path(output_dir)
        self.output_dir.mkdir(parents=True, exist_ok=True)

    def request(self, flow: http.HTTPFlow):
        """Hook called for every HTTP request."""
        if self._is_firmware_request(flow.request):
            print(f"[+] Firmware update request detected: {flow.request.url}")

    def response(self, flow: http.HTTPFlow):
        """Hook called for every HTTP response."""
        if flow.response is not None and self._is_firmware_response(flow.response):
            self._save_firmware(flow)

    def _is_firmware_request(self, request) -> bool:
        """Decide whether a request looks like a firmware update request."""
        url = str(request.url).lower()
        content_type = request.headers.get('Content-Type', '').lower()

        # Keywords or firmware-like extensions in the URL
        url_matches = (any(keyword in url for keyword in self.URL_KEYWORDS)
                       or url.split('?')[0].endswith(self.FILE_EXTENSIONS))
        # Content-Type of the request
        content_matches = any(pattern in content_type
                              for pattern in ['octet-stream', 'binary', 'zip'])
        return url_matches or content_matches

    def _is_firmware_response(self, response) -> bool:
        """Decide whether a response looks like a firmware file."""
        content_type = response.headers.get('Content-Type', '').lower()
        content_disposition = response.headers.get('Content-Disposition', '').lower()
        body = response.content or b''

        # Content types commonly used for firmware
        firmware_types = [
            'application/octet-stream',
            'application/x-firmware',
            'application/zip',
            'application/x-gzip'
        ]

        # File name from the Content-Disposition header, if any
        filename = None
        if 'filename=' in content_disposition:
            filename = content_disposition.split('filename=')[-1].strip('"\'')

        # Any one of these conditions is enough (a loose heuristic that will
        # also capture other large downloads)
        conditions = [
            any(ft in content_type for ft in firmware_types),
            bool(filename) and filename.endswith(self.FILE_EXTENSIONS),
            len(body) > 1024 * 1024,  # Larger than 1 MB
        ]
        return any(conditions)

    def _save_firmware(self, flow: http.HTTPFlow):
        """Save an intercepted firmware file together with its metadata."""
        body = flow.response.content or b''
        timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
        url_hash = hashlib.md5(flow.request.url.encode()).hexdigest()[:8]
        fallback = f"firmware_{timestamp}_{url_hash}.bin"

        # Try to take the file name from the response; keep only the base name
        # so a hostile header cannot write outside the capture directory
        content_disposition = flow.response.headers.get('Content-Disposition', '')
        if 'filename=' in content_disposition:
            raw_name = content_disposition.split('filename=')[-1].strip('"\'')
            filename = Path(raw_name).name or fallback
        else:
            filename = fallback

        save_path = self.output_dir / filename

        # Save the firmware file
        with open(save_path, 'wb') as f:
            f.write(body)

        # Save the request metadata
        meta = {
            'url': flow.request.url,
            'method': flow.request.method,
            'headers': dict(flow.request.headers),
            'timestamp': timestamp,
            'size': len(body),
            'md5': hashlib.md5(body).hexdigest(),
            'sha256': hashlib.sha256(body).hexdigest()
        }
        meta_path = save_path.with_suffix('.json')
        with open(meta_path, 'w', encoding='utf-8') as f:
            json.dump(meta, f, indent=2, ensure_ascii=False)

        print(f"[+] Firmware saved: {save_path}")
        print(f"    Size: {len(body) / 1024 / 1024:.2f} MB")
        print(f"    MD5: {meta['md5']}")

        # Run a quick first-pass analysis
        self._quick_analysis(save_path)

    def _quick_analysis(self, firmware_path: Path):
        """Quickly analyze a captured firmware file."""
        print("\n[+] Starting quick analysis...")

        # 1. Identify the file type
        result = subprocess.run(['file', str(firmware_path)],
                                capture_output=True, text=True)
        print(f"    File type: {result.stdout.strip()}")

        # 2. Inspect the file header
        with open(firmware_path, 'rb') as f:
            header = f.read(512)
        hex_dump = ' '.join(f'{b:02x}' for b in header[:64])
        print(f"    Header (first 64 bytes): {hex_dump}")

        # 3. Search the printable strings
        strings_result = subprocess.run(['strings', str(firmware_path)],
                                        capture_output=True, text=True)
        strings = strings_result.stdout.split('\n')

        # Record the first string that contains each keyword
        keywords = ['root', 'admin', 'password', 'kernel', 'linux', 'vmlinux']
        found = []
        for keyword in keywords:
            for string in strings:
                if keyword.lower() in string.lower():
                    found.append(string[:50])
                    break

        if found:
            print(f"    Keywords found: {found[:5]}...")


# Load the addon with mitmproxy, for example: mitmdump -s <this script>
addons = [OTAFirmwareInterceptor()]
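
Because the file name in `Content-Disposition` is chosen by the remote server, it should be treated as untrusted input before it is used as a path on the analysis machine. The sketch below shows one stricter way to derive a safe name. `safe_capture_name` is a hypothetical helper, not part of mitmproxy, and it deliberately ignores the RFC 5987 `filename*=` form, falling back to the supplied default in that case.

```python
import re
from pathlib import PurePosixPath, PureWindowsPath


def safe_capture_name(content_disposition: str, fallback: str) -> str:
    """Derive a file name that cannot escape the capture directory."""
    match = re.search(r'filename\s*=\s*"?([^";]+)"?', content_disposition, re.IGNORECASE)
    if not match:
        return fallback
    raw = match.group(1).strip()
    # Drop directory components written with either / or \ separators
    name = PureWindowsPath(PurePosixPath(raw).name).name
    # Keep a conservative character set and refuse hidden or empty names
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name).lstrip(".")
    return name or fallback
```

For example, `safe_capture_name('attachment; filename="../../etc/passwd"', "fw.bin")` yields `passwd`, which is then written inside the capture directory rather than over a system file.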

3.3 Approach 3: Physical Extraction

When the firmware cannot be obtained through software, physical extraction is the last resort:

#!/bin/bash
# firmware_extraction.sh
# Automation script for physically extracting firmware

set -e

# Colored output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No color

# Log to stderr so that functions can return values on stdout
log_info()  { echo -e "${GREEN}[INFO]${NC} $1" >&2; }
log_warn()  { echo -e "${YELLOW}[WARN]${NC} $1" >&2; }
log_error() { echo -e "${RED}[ERROR]${NC} $1" >&2; }

# Check for required tools
check_tools() {
    local tools=("flashrom" "dd" "strings" "binwalk" "file")
    local missing=()

    for tool in "${tools[@]}"; do
        if ! command -v "$tool" &> /dev/null; then
            missing+=("$tool")
        fi
    done

    if [ ${#missing[@]} -ne 0 ]; then
        log_error "Missing required tools: ${missing[*]}"
        exit 1
    fi
}

# Identify the chip type
identify_chip() {
    log_info "Identifying the flash chip..."

    # Running flashrom without -r only probes the chip; it does not read it
    if sudo flashrom --programmer linux_spi:dev=/dev/spidev0.0 &> /tmp/chip_info.txt; then
        # flashrom reports lines such as: Found <vendor> flash chip "<name>" (...)
        local chip_name
        chip_name=$(sed -n 's/.*flash chip "\([^"]*\)".*/\1/p' /tmp/chip_info.txt | head -n 1)
        log_info "Chip detected: ${chip_name:-unknown}"
        echo "${chip_name:-unknown}"
    else
        log_warn "Automatic detection failed; identify the chip manually"
        echo "unknown"
    fi
}

# Read through a flash programmer
read_with_flashrom() {
    local chip=$1
    local output_file=$2
    local chip_args=()

    # Only pass -c when the chip name is known
    if [ -n "$chip" ] && [ "$chip" != "unknown" ]; then
        chip_args=(-c "$chip")
    fi

    log_info "Reading chip with flashrom: $chip"

    # Try several programmers in turn
    local programmers=("linux_spi:dev=/dev/spidev0.0" "ch341a_spi" "dediprog")

    for programmer in "${programmers[@]}"; do
        log_info "Trying $programmer..."
        if sudo flashrom --programmer "$programmer" "${chip_args[@]}" -r "$output_file" 2>/dev/null; then
            log_info "Read succeeded: $output_file"
            return 0
        fi
    done

    log_error "All programmers failed"
    return 1
}

# Read through a debug interface
read_via_jtag() {
    local output_file=$1

    log_info "Trying to read over JTAG..."

    # Check whether a JTAG adapter is connected
    if lsusb | grep -i "jtag\|openocd\|ftdi\|segger" &> /dev/null; then
        log_info "JTAG adapter detected"

        # The OpenOCD script must be adapted to the actual target;
        # the interface, target, address, and length below are examples
        cat > /tmp/jtag_read.cfg << EOF
# Example OpenOCD configuration
source [find interface/jlink.cfg]
transport select jtag
source [find target/stm32f1x.cfg]
init
halt
dump_image $output_file 0x08000000 0x10000
shutdown
EOF

        if openocd -f /tmp/jtag_read.cfg &> /tmp/openocd.log; then
            log_info "JTAG read succeeded"
            return 0
        fi
    fi

    return 1
}

# Read over UART
read_via_uart() {
    local device=$1
    local baudrate=$2
    local output_file=$3

    log_info "Trying to obtain the firmware interactively over UART ($device @ $baudrate)..."

    # Automate the UART session with an expect script
    # (the login credentials below are examples)
    cat > /tmp/uart_read.exp << EOF
#!/usr/bin/expect -f
set timeout 10
spawn screen $device $baudrate
expect {
    "login:"    { send "root\\r"; exp_continue }
    "Password:" { send "admin\\r"; exp_continue }
    "# "        { send "cat /proc/mtd\\r" }
    timeout     { exit 1 }
}
expect "# "
send "dd if=/dev/mtd0 of=/tmp/firmware.bin\\r"
expect "# "
send "exit\\r"
expect eof
EOF

    if expect /tmp/uart_read.exp &> /tmp/uart.log; then
        # The dump stays on the device; it still has to be transferred off by other means
        log_info "UART session finished; retrieve /tmp/firmware.bin from the device manually"
        return 0
    fi

    return 1
}

# Analyze the extracted firmware
analyze_firmware() {
    local firmware_file=$1

    log_info "\n===== Firmware analysis report ====="

    # Basic information
    log_info "File information:"
    file "$firmware_file"

    # Size
    local size
    size=$(stat -c%s "$firmware_file")
    log_info "File size: $(numfmt --to=iec "$size")"

    # binwalk analysis
    log_info "\nBinwalk analysis:"
    binwalk "$firmware_file" | head -20

    # Search the strings
    log_info "\nStrings found (keywords):"
    strings "$firmware_file" | grep -iE "version|v[0-9]\.[0-9]|copyright|license" | head -10 || true

    # Compute hashes
    log_info "\nFile hashes:"
    md5sum "$firmware_file"
    sha256sum "$firmware_file"
}

main() {
    check_tools

    local chip_type=""
    local output_file="firmware_dump_$(date +%Y%m%d_%H%M%S).bin"

    echo "Choose an extraction method:"
    echo "1) Flash programmer"
    echo "2) JTAG debug interface"
    echo "3) UART serial port"
    echo "4) Try all methods automatically"

    read -p "Choice [1-4]: " method

    # "|| true" keeps set -e from aborting before the final check reports the failure
    case $method in
        1)
            chip_type=$(identify_chip)
            read_with_flashrom "$chip_type" "$output_file" || true
            ;;
        2)
            read_via_jtag "$output_file" || true
            ;;
        3)
            read -p "Serial device (e.g. /dev/ttyUSB0): " uart_device
            read -p "Baud rate (e.g. 115200): " uart_baud
            read_via_uart "$uart_device" "$uart_baud" "$output_file" || true
            ;;
        4)
            log_info "Trying all extraction methods..."
            # Try them in order; UART needs manual parameters, so use option 3 for it
            chip_type=$(identify_chip)
            read_with_flashrom "$chip_type" "$output_file" \
                || read_via_jtag "$output_file" \
                || log_warn "Flash programmer and JTAG both failed"
            ;;
        *)
            log_error "Invalid choice"
            exit 1
            ;;
    esac

    if [ -f "$output_file" ] && [ -s "$output_file" ]; then
        analyze_firmware "$output_file"
        log_info "Firmware saved to: $output_file"
    else
        log_error "Firmware extraction failed"
    fi
}

main "$@"
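
Physical reads are prone to silent bit errors caused by loose clips, long wires, or marginal supply voltage, so a common precaution is to read the chip at least twice and confirm that the dumps are identical before analyzing either. The following is a minimal sketch of such a comparison. `first_mismatch` is a hypothetical helper introduced for illustration, and the two file paths are assumed to be dumps of the same chip.

```python
from pathlib import Path
from typing import Optional


def first_mismatch(first: Path, second: Path, chunk_size: int = 65536) -> Optional[int]:
    """Return the offset of the first differing byte, or None if the dumps are identical."""
    offset = 0
    with open(first, "rb") as a, open(second, "rb") as b:
        while True:
            block_a = a.read(chunk_size)
            block_b = b.read(chunk_size)
            if block_a != block_b:
                # Locate the exact byte inside the differing block
                for i, (x, y) in enumerate(zip(block_a, block_b)):
                    if x != y:
                        return offset + i
                # One dump ended early: they differ at the shorter length
                return offset + min(len(block_a), len(block_b))
            if not block_a:
                return None
            offset += len(block_a)
```

Reporting the offset rather than only a yes/no answer helps with diagnosis: mismatches scattered across the image usually point to a signal-integrity problem, while a single differing region may simply be data the device rewrote between reads.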

4. Binwalk in Depth: From Basics to Advanced Use

4.1 Basic Scanning and Extraction

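#!/bin/bash
# binwalk_analysis.sh
# Written for binwalk 2.x command-line options

if [ -z "$1" ]; then
    echo "Usage: $0 <firmware file>"
    exit 1
fi

FIRMWARE_FILE="$1"
OUTPUT_DIR="${FIRMWARE_FILE%.*}_extracted"
mkdir -p "$OUTPUT_DIR"

echo "=== Firmware analysis: $(basename "$FIRMWARE_FILE") ==="

# 1. Signature scan (identifies embedded file types; -B is the default mode)
echo "[1/6] Signature scan..."
binwalk -B "$FIRMWARE_FILE" | head -20

# 2. Opcode scan (helps identify the CPU architecture)
echo -e "\n[2/6] Opcode scan..."
binwalk -A "$FIRMWARE_FILE" | head -20

# 3. Entropy analysis (detects encrypted or compressed regions; -N skips the plot)
echo -e "\n[3/6] Entropy analysis..."
binwalk -E -N "$FIRMWARE_FILE"

# 4. Recursive extraction (the main step)
# Recent 2.3.x releases refuse to extract as root unless --run-as=root is added
echo -e "\n[4/6] Recursively extracting file systems..."
binwalk -Me "$FIRMWARE_FILE" -C "$OUTPUT_DIR"

# 5. Extract specific file types (rule format: <type>:<extension>:<command>)
echo -e "\n[5/6] Extracting specific file types..."
binwalk -D 'squashfs:squashfs:unsquashfs -d %e.root %e' \
    -D 'gzip:gz:gunzip %e' \
    -D 'lzma:lzma:lzma -d %e' \
    "$FIRMWARE_FILE" \
    -C "$OUTPUT_DIR/extracted_types"

# 6. Generate the analysis report
echo -e "\n[6/6] Generating analysis report..."
{
    echo "# Firmware Analysis Report"
    echo "## Basic Information"
    echo "- File name: $(basename "$FIRMWARE_FILE")"
    echo "- Size: $(du -h "$FIRMWARE_FILE" | cut -f1)"
    echo "- Analysis time: $(date)"
    echo ""
    echo "## File Structure"
    binwalk "$FIRMWARE_FILE"
    echo ""
    echo "## Extracted Files"
    find "$OUTPUT_DIR" -type f | head -30
} > "$OUTPUT_DIR/analysis_report.md"

echo -e "\n=== Analysis complete ==="
echo "Output directory: $OUTPUT_DIR"
echo "Report file: $OUTPUT_DIR/analysis_report.md"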
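
When automatic extraction fails, it often helps to locate the file system by hand and carve it out with `dd`. The sketch below searches an image for the SquashFS magic values, `hsqs` for little-endian and `sqsh` for big-endian images. `find_squashfs` is a hypothetical helper introduced for illustration. It does not validate the superblock, so some hits may be false positives that still need checking, for example against the binwalk output.

```python
from typing import List, Tuple

# SquashFS magic values and the byte order they indicate
SQUASHFS_MAGICS = {b"hsqs": "little-endian", b"sqsh": "big-endian"}


def find_squashfs(data: bytes) -> List[Tuple[int, str]]:
    """Return (offset, byte order) for every SquashFS magic found in the image."""
    hits = []
    for magic, endian in SQUASHFS_MAGICS.items():
        start = 0
        while (pos := data.find(magic, start)) != -1:
            hits.append((pos, endian))
            start = pos + 1
    return sorted(hits)
```

A candidate offset can then be carved with something like `dd if=firmware.bin of=rootfs.squashfs bs=1 skip=<offset>` and tested with `unsquashfs -l rootfs.squashfs`, which lists the contents only if the superblock is valid.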

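4.2 Advanced Techniques: Custom Signatures and Automation

Create a custom signature file, custom.sig:

# custom.sig - custom firmware signatures
# binwalk 2.x uses a libmagic-style format: <offset> <type> <value> <description>
# binwalk tests each signature at every position in the file, so offset 0
# means "the start of the match", not "the start of the file"

# Example: recognizing vendor-specific firmware formats

# 1. TP-Link firmware header (example)
# Usually starts with TP-LINK
0    string    TP-LINK    TP-Link firmware header (example),

# 2. A specific encrypted format
# Some devices use custom encryption; match its magic number (89 43 52 59 50 54)
0    string    \x89CRYPT    Custom encrypted container (example),

# 3. U-Boot image
# U-Boot usually contains an identifying string
0    string    U-Boot    U-Boot string (example),

# 4. File system that does not start at the beginning of the file
0    string    sqsh    SquashFS file system, big endian (example),

# 5. Compression format
# A specific LZMA properties header (5D 00 00 80 00)
0    string    \x5d\x00\x00\x80\x00    LZMA data with custom properties (example),

# 6. Version information
0    string    Version:    Version string (example),

Use the custom signatures:

# Scan with the custom signature file only
binwalk -m custom.sig firmware.bin

# Combine with the built-in signatures; in binwalk 2.x, adding -B explicitly is
# expected to keep them active (confirm with binwalk --help for your version)
binwalk -B -m custom.sig firmware.bin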

4.3 Automated Analysis with the Python API

#!/usr/bin/env python3
"""
Automated analysis framework built on the binwalk 2.x Python API
"""
import hashlib
import json
import logging
import os
from pathlib import Path
from typing import Any, Dict, List

import binwalk


class AdvancedFirmwareAnalyzer:
    """Advanced firmware analyzer."""

    def __init__(self, firmware_path: str):
        self.firmware_path = Path(firmware_path)
        self.output_dir = self.firmware_path.parent / f"{self.firmware_path.stem}_analysis"
        self.output_dir.mkdir(exist_ok=True)

        # Configure logging
        logging.basicConfig(
            level=logging.INFO,
            format='%(asctime)s - %(levelname)s - %(message)s',
            handlers=[
                logging.FileHandler(self.output_dir / 'analysis.log'),
                logging.StreamHandler()
            ]
        )
        self.logger = logging.getLogger(__name__)

    def comprehensive_scan(self) -> Dict[str, Any]:
        """Run the full scan."""
        self.logger.info(f"Analyzing firmware: {self.firmware_path.name}")

        results = {
            'file_info': self._get_file_info(),
            'entropy_analysis': self._entropy_analysis(),
            'signature_scan': self._signature_scan(),
            'strings_analysis': self._strings_analysis(),
            'extraction_results': self._extract_files()
        }

        # Save the results
        report_path = self.output_dir / 'comprehensive_report.json'
        with open(report_path, 'w', encoding='utf-8') as f:
            json.dump(results, f, indent=2, ensure_ascii=False)

        self.logger.info(f"Analysis complete, report saved: {report_path}")
        return results

    def _get_file_info(self) -> Dict[str, Any]:
        """Collect basic file information."""
        import subprocess

        info = {}
        info['path'] = str(self.firmware_path)
        info['size'] = os.path.getsize(self.firmware_path)
        info['md5'] = self._calculate_hash('md5')
        info['sha256'] = self._calculate_hash('sha256')

        # File type: prefer python-magic, fall back to the file command
        try:
            import magic
            mime = magic.Magic(mime=True)
            info['mime_type'] = mime.from_file(str(self.firmware_path))
            file_type = magic.Magic()
            info['file_type'] = file_type.from_file(str(self.firmware_path))
        except Exception:
            info['file_type'] = subprocess.run(
                ['file', str(self.firmware_path)],
                capture_output=True, text=True
            ).stdout.strip()

        return info

    def _calculate_hash(self, algorithm: str) -> str:
        """Compute a hash of the firmware file."""
        hash_func = getattr(hashlib, algorithm)()
        with open(self.firmware_path, 'rb') as f:
            for chunk in iter(lambda: f.read(4096), b''):
                hash_func.update(chunk)
        return hash_func.hexdigest()

    def _entropy_analysis(self) -> Dict[str, Any]:
        """Entropy analysis (detects encryption or compression)."""
        import math

        with open(self.firmware_path, 'rb') as f:
            data = f.read()

        if not data:
            return {'error': 'File is empty'}

        # Count byte frequencies
        byte_count = {}
        total_bytes = len(data)
        for byte in data:
            byte_count[byte] = byte_count.get(byte, 0) + 1

        # Compute the Shannon entropy in bits per byte
        entropy = 0.0
        for count in byte_count.values():
            probability = count / total_bytes
            if probability > 0:
                entropy -= probability * math.log2(probability)

        analysis = {
            'entropy': entropy,
            'max_entropy': 8.0,  # Entropy of uniformly random bytes
            'entropy_percentage': (entropy / 8.0) * 100,
            'interpretation': ''
        }

        # Interpret the entropy value
        if entropy < 1.0:
            analysis['interpretation'] = 'Low entropy: probably padding or highly repetitive data'
        elif entropy < 6.0:
            analysis['interpretation'] = 'Medium entropy: probably code, text, or other uncompressed data'
        else:
            analysis['interpretation'] = 'High entropy: probably compressed or encrypted data'

        # Build the data for an entropy distribution plot
        chunk_size = 1024
        entropy_points = []

        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            if len(chunk) < chunk_size:
                break

            chunk_entropy = 0.0
            chunk_count = {}
            for byte in chunk:
                chunk_count[byte] = chunk_count.get(byte, 0) + 1
            for count in chunk_count.values():
                prob = count / chunk_size
                if prob > 0:
                    chunk_entropy -= prob * math.log2(prob)

            entropy_points.append({
                'offset': i,
                'entropy': chunk_entropy
            })

        analysis['entropy_distribution'] = entropy_points[:100]  # First 100 points only
        return analysis

    def _signature_scan(self) -> List[Dict[str, Any]]:
        """Run a signature scan."""
        self.logger.info("Running signature scan...")

        try:
            # In binwalk 2.x, binwalk.scan() mirrors the command line: keyword
            # arguments map to long options, and it returns one object per module run
            modules = binwalk.scan(
                str(self.firmware_path),
                signature=True,
                quiet=True
            )

            results = []
            for module in modules:
                for result in module.results:
                    results.append({
                        'offset': result.offset,
                        'description': result.description
                    })
            return results

        except Exception as e:
            self.logger.error(f"Signature scan failed: {e}")
            return []

    def _strings_analysis(self) -> Dict[str, List[str]]:
        """Analyze printable strings."""
        import re
        import subprocess

        self.logger.info("Analyzing strings...")

        strings_result = {
            'potential_credentials': [],
            'urls': [],
            'ip_addresses': [],
            'file_paths': [],
            'interesting_strings': []
        }

        try:
            # Extract strings of at least four characters
            result = subprocess.run(
                ['strings', '-n', '4', str(self.firmware_path)],
                capture_output=True, text=True, timeout=60
            )
            all_strings = result.stdout.split('\n')

            # Search for passwords and credentials
            password_patterns = [
                r'password\s*[:=]\s*([^\s;]+)',
                r'passwd\s*[:=]\s*([^\s;]+)',
                r'admin\s*[:=]\s*([^\s;]+)',
                r'root\s*[:=]\s*([^\s;]+)',
                r'[A-Za-z0-9+/]{20,}={0,2}',  # Possibly Base64-encoded credentials
            ]
            for pattern in password_patterns:
                for string in all_strings:
                    matches = re.findall(pattern, string, re.IGNORECASE)
                    strings_result['potential_credentials'].extend(matches)

            # Search for URLs
            url_pattern = r'https?://[^\s<>"\'{}|\\^`\[\]]+'
            for string in all_strings:
                urls = re.findall(url_pattern, string)
                strings_result['urls'].extend(urls)

            # Search for IP addresses
            ip_pattern = r'\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b'
            for string in all_strings:
                ips = re.findall(ip_pattern, string)
                strings_result['ip_addresses'].extend(ips)

            # Search for file paths
            path_patterns = [
                r'/etc/[^\s<>"\'{}|\\^`\[\]]+',
                r'/bin/[^\s<>"\'{}|\\^`\[\]]+',
                r'/sbin/[^\s<>"\'{}|\\^`\[\]]+',
                r'/usr/[^\s<>"\'{}|\\^`\[\]]+',
            ]
            for pattern in path_patterns:
                for string in all_strings:
                    paths = re.findall(pattern, string)
                    strings_result['file_paths'].extend(paths)

            # Other interesting strings (keywords); keep the first hit per keyword
            keywords = [
                'backdoor', 'debug', 'test', 'secret',
                'key', 'token', 'auth', 'login',
                'kernel', 'init', 'startup',
                'vulnerability', 'exploit',
                'CVE-', '0day', 'hack'
            ]
            for keyword in keywords:
                for string in all_strings:
                    if keyword.lower() in string.lower():
                        strings_result['interesting_strings'].append(string)
                        break

            # De-duplicate, keeping at most 50 entries per category
            for key in strings_result:
                strings_result[key] = list(set(strings_result[key]))[:50]

        except subprocess.TimeoutExpired:
            self.logger.warning("String extraction timed out")
        except Exception as e:
            self.logger.error(f"String analysis error: {e}")

        return strings_result

    def _extract_files(self) -> Dict[str, Any]:
        """Extract embedded files and file systems."""
        import subprocess

        self.logger.info("Extracting file systems...")

        extract_dir = self.output_dir / 'extracted'
        extract_dir.mkdir(exist_ok=True)

        try:
            # Extraction relies on the signature scan, so both are enabled
            binwalk.scan(
                str(self.firmware_path),
                signature=True,
                extract=True,
                matryoshka=True,             # Recursive extraction (-M)
                directory=str(extract_dir),  # Output directory (-C)
                quiet=True
            )

            # Collect the extraction results
            extracted_files = []
            for root, _dirs, files in os.walk(extract_dir):
                for name in files:
                    filepath = os.path.join(root, name)
                    rel_path = os.path.relpath(filepath, extract_dir)
                    try:
                        file_type = subprocess.run(
                            ['file', '-b', filepath],
                            capture_output=True, text=True
                        ).stdout.strip()
                        extracted_files.append({
                            'path': rel_path,
                            'size': os.path.getsize(filepath),
                            'type': file_type,
                            'md5': self._calculate_file_hash(filepath, 'md5')
                        })
                    except OSError:
                        continue

            return {
                'extract_dir': str(extract_dir),
                'file_count': len(extracted_files),
                'files': extracted_files[:100],  # First 100 files only
                'total_size': sum(f['size'] for f in extracted_files)
            }

        except Exception as e:
            self.logger.error(f"File extraction failed: {e}")
            return {'error': str(e)}

    def _calculate_file_hash(self, filepath: str, algorithm: str) -> str:
        """Compute the hash of a single file."""
        hash_func = getattr(hashlib, algorithm)()
        try:
            with open(filepath, 'rb') as f:
                for chunk in iter(lambda: f.read(4096), b''):
                    hash_func.update(chunk)
            return hash_func.hexdigest()
        except OSError:
            return 'error'

    def generate_html_report(self, data=None):
        """Generate a detailed report in HTML format.

        The HTML tags were lost from the source listing; the markup below is a
        reconstruction that keeps the original sections and styles.
        """
        import datetime
        import html

        if data is None:
            data = self.comprehensive_scan()
        file_info = data['file_info']
        entropy = data['entropy_analysis']
        extraction = data['extraction_results']
        generated = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')

        html_content = f'''<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Firmware Analysis Report - {html.escape(self.firmware_path.name)}</title>
<style>
body {{ font-family: Arial, sans-serif; margin: 20px; line-height: 1.6; }}
.container {{ max-width: 1200px; margin: 0 auto; }}
.header {{ background: #f5f5f5; padding: 20px; border-radius: 5px; margin-bottom: 20px; }}
.section {{ margin-bottom: 30px; padding: 20px; border: 1px solid #ddd; border-radius: 5px; }}
.section-title {{ color: #333; border-bottom: 2px solid #007bff; padding-bottom: 10px; }}
.danger {{ color: #dc3545; font-weight: bold; }}
.warning {{ color: #ffc107; font-weight: bold; }}
.success {{ color: #28a745; font-weight: bold; }}
pre {{ background: #f8f9fa; padding: 10px; border-radius: 3px; overflow: auto; }}
table {{ width: 100%; border-collapse: collapse; margin: 10px 0; }}
th, td {{ border: 1px solid #ddd; padding: 8px; text-align: left; }}
th {{ background: #f2f2f2; }}
tr:nth-child(even) {{ background: #f9f9f9; }}
</style>
</head>
<body>
<div class="container">
<div class="header">
<h1>Firmware Security Analysis Report</h1>
<p><strong>File:</strong> {html.escape(str(file_info.get('path', 'N/A')))}</p>
<p><strong>Analysis time:</strong> {generated}</p>
</div>
<div class="section">
<h2 class="section-title">Basic File Information</h2>
<table>
<tr><th>Item</th><th>Value</th></tr>
<tr><td>File size</td><td>{file_info.get('size', 'N/A')} bytes</td></tr>
<tr><td>MD5</td><td><code>{file_info.get('md5', 'N/A')}</code></td></tr>
<tr><td>SHA-256</td><td><code>{file_info.get('sha256', 'N/A')}</code></td></tr>
<tr><td>File type</td><td>{html.escape(str(file_info.get('file_type', 'N/A')))}</td></tr>
</table>
</div>
<div class="section">
<h2 class="section-title">Entropy Analysis</h2>
<p><strong>Entropy:</strong> {entropy.get('entropy', 0.0):.4f} / 8.0</p>
<p><strong>Percentage:</strong> {entropy.get('entropy_percentage', 0.0):.1f}%</p>
<p><strong>Interpretation:</strong> {html.escape(str(entropy.get('interpretation', 'N/A')))}</p>
<h3>Entropy Distribution</h3>
<!-- Chart.js code could be added here to plot the entropy distribution;
     a real implementation needs to include the Chart.js library -->
</div>
<div class="section">
<h2 class="section-title">Signature Scan Results</h2>
<table>
<tr><th>Offset</th><th>Description</th></tr>
'''
        for sig in data['signature_scan'][:20]:  # First 20 entries only
            html_content += (
                f"<tr><td>0x{sig.get('offset', 0):X}</td>"
                f"<td>{html.escape(str(sig.get('description', 'N/A')))}</td></tr>\n"
            )
        html_content += '''</table>
</div>
<div class="section">
<h2 class="section-title">String Analysis</h2>
'''
        for category, items in data['strings_analysis'].items():
            if items:
                html_content += f"<h3>{category.replace('_', ' ').title()}</h3>\n<ul>\n"
                for item in items[:10]:  # First 10 entries per category
                    escaped_item = html.escape(str(item))
                    # Highlight potentially dangerous credentials
                    if category == 'potential_credentials':
                        html_content += f'<li class="danger">{escaped_item}</li>\n'
                    else:
                        html_content += f'<li>{escaped_item}</li>\n'
                html_content += '</ul>\n'

        html_content += '''</div>
<div class="section">
<h2 class="section-title">Extracted Files</h2>
<p><strong>Extraction directory:</strong> {}</p>
<p><strong>Total files:</strong> {}</p>
<p><strong>Total size:</strong> {:.2f} MB</p>
<h3>Partial File List</h3>
<table>
<tr><th>Path</th><th>Size</th><th>Type</th><th>MD5</th></tr>
'''.format(
            html.escape(str(extraction.get('extract_dir', 'N/A'))),
            extraction.get('file_count', 0),
            extraction.get('total_size', 0) / 1024 / 1024
        )
        for entry in extraction.get('files', [])[:15]:
            html_content += (
                f"<tr><td>{html.escape(str(entry.get('path', 'N/A')))}</td>"
                f"<td>{entry.get('size', 0)} bytes</td>"
                f"<td>{html.escape(str(entry.get('type', 'N/A')))}</td>"
                f"<td><code>{entry.get('md5', 'N/A')}</code></td></tr>\n"
            )
        html_content += '''</table>
</div>
<div class="section">
<h2 class="section-title">Security Recommendations</h2>
<ul>
<li>Review every potential credential found and confirm whether any password is hardcoded</li>
<li>Analyze the extracted executables for vulnerabilities</li>
<li>Verify the firmware's signature and integrity-check mechanisms</li>
<li>Review the network services and open-port configuration</li>
</ul>
</div>
<div class="section">
<h2 class="section-title">Disclaimer</h2>
<p>This report is intended for security research and education only. Make sure you have proper authorization before performing any security testing.</p>
</div>
</div>
</body>
</html>
'''
        report_path = self.output_dir / 'report.html'
        with open(report_path, 'w', encoding='utf-8') as f:
            f.write(html_content)

        self.logger.info(f"HTML report generated: {report_path}")
        return str(report_path)


# Usage example
if __name__ == "__main__":
    analyzer = AdvancedFirmwareAnalyzer("firmware.bin")

    # Generate the full analysis report
    results = analyzer.comprehensive_scan()

    # Generate the HTML report from the same results
    html_report = analyzer.generate_html_report(results)

    print("\nAnalysis complete!")
    print(f"- JSON report: {analyzer.output_dir}/comprehensive_report.json")
    print(f"- HTML report: {html_report}")
    print(f"- Log file: {analyzer.output_dir}/analysis.log")

5. Sensitive Information Mining: Automated Search and Analysis

5.1 An Automated Sensitive-Information Search System

#!/usr/bin/env python3
"""
Automated search system for sensitive information in firmware.
Detects several categories of sensitive information.
"""
import os
import re
import json
import hashlib
import logging
from pathlib import Path
from typing import Dict, List, Set, Tuple, Optional
import mimetypes
import subprocess


class SensitiveInfoHunter:
    """Sensitive information hunter."""

    # Predefined regular expression patterns
    PATTERNS = {
        # Hardcoded credentials
        'hardcoded_passwords': [
            r'(?i)(?:password|passwd|pwd)[=:\s]+[\'"]?([^\'"\s]{4,})',
            r'(?i)admin\s*[=:]\s*[\'"]?([^\'"\s]{3,})',
            r'(?i)root\s*[=:]\s*[\'"]?([^\'"\s]{3,})',
            r'(?i)(?:user|username)\s*[=:]\s*[\'"]?([^\'"\s]{3,})',
        ],
        # API keys and tokens
        'api_keys': [
            # AWS keys
            r'(?:AKIA|ASIA)[A-Z0-9]{16}',
            # Google API keys
            r'AIza[0-9A-Za-z\-_]{35}',
            # GitHub tokens
            r'ghp_[a-zA-Z0-9]{36}',
            # JWT tokens
            r'eyJ[a-zA-Z0-9]{10,}\.[a-zA-Z0-9]{10,}\.[a-zA-Z0-9_\-]{10,}',
        ],
        # Cryptographic keys
        'crypto_keys': [
            # RSA, EC, and DSA private keys
            r'-----BEGIN (?:RSA|EC|DSA) PRIVATE KEY-----',
            # SSH private keys
            r'-----BEGIN OPENSSH PRIVATE KEY-----',
            # PGP private keys
            r'-----BEGIN PGP PRIVATE KEY BLOCK-----',
            # Generic private keys
            r'-----BEGIN PRIVATE KEY-----',
        ],
        # Database connection strings
        'database_connections': [
            # MySQL
            r'mysql://[^:\s]+:[^@\s]+@[^\s/]+/[^\s?]+',
            # PostgreSQL
            r'postgres(?:ql)?://[^:\s]+:[^@\s]+@[^\s/]+/[^\s?]+',
            # MongoDB (the + must be escaped, otherwise the pattern does not compile)
            r'mongodb(?:\+srv)?://[^:\s]+:[^@\s]+@[^\s/]+',
        ],
        # Secrets in configuration files
        'config_secrets': [
            # Generic key=value format
            r'(?i)(?:secret|key|token|credential)[\s=:]+[\'"]?([^\'"\s]{8,})',
            # JSON format
            r'["\']?(?:secret|key|token)["\']?\s*:\s*["\']([^\'"]{8,})["\']',
            # XML format (the end of this pattern was lost in the source; it is
            # reconstructed here to stop at the start of the closing tag)
            r'<(?:secret|key|token)[^>]*>([^<]{8,})<',
        ],
        # Debug information and backdoors
        'debug_backdoors': [
            # Debug switches
            r'debug\s*[=:]\s*[\'"]?(?:true|yes|on|1)[\'"]?',
            # Backdoor commands
            r'(?i)(?:backdoor|debug|test)_?(?:mode|shell|cmd)',
            # Hardcoded debug credentials
            r'(?:debug|test)[=:]\s*[\'"]?(?:admin|root)[\'"]?',
        ],
    }

    # Sensitive file extensions
    SENSITIVE_EXTENSIONS = {
        '.pem', '.key', '.crt', '.pfx', '.p12',            # Certificates and keys
        '.db', '.sqlite', '.sqlite3', '.db3',              # Database files
        '.env', '.config', '.cfg', '.conf',                # Configuration files
        '.sh', '.bash', '.zsh', '.fish',                   # Shell scripts
        '.py', '.php', '.js', '.java', '.c',               # Source files
        '.xml', '.json', '.yml', '.yaml',                  # Configuration files
        '.log', '.txt', '.md',                             # Text files
        '.history', '.bash_history', '.zsh_history',       # History files
    }

    # Sensitive path keywords
    SENSITIVE_PATHS = {
        '/etc/passwd', '/etc/shadow', '/etc/sudoers',
        '/root/', '/home/', '/var/log/', '/tmp/',
        '.ssh/', '.aws/', '.config/', '.local/',
        'wp-config.php', 'config.php', 'settings.py',
    }

    def __init__(self, search_path: str, output_dir: str = "./reports"):
        self.search_path = Path(search_path)
        self.output_dir = Path(output_dir)
        self.output_dir.mkdir(exist_ok=True)

        # Result storage
        self.findings = {
            category: [] for category in self.PATTERNS.keys()
        }
        self.findings['sensitive_files'] = []
        self.findings['statistics'] = {}

    # 设置日志

    self._setup_logging()

    def _setup_logging(self):

    """设置日志系统"""

    log_file = self.output_dir / 'hunter.log'

    logging.basicConfig(

    level=logging.INFO,

    format='%(asctime)s - %(levelname)s - %(message)s',

    handlers=[

    logging.FileHandler(log_file),

    logging.StreamHandler()

    ]

    )

    self.logger = logging.getLogger(name)

    def scan(self) -> Dict:

    """执行完整扫描"""

    self.logger.info(f"开始扫描: {self.search_path}")

    # 1. 扫描文件

    self._scan_files()

    # 2. 扫描二进制文件中的字符串

    self._scan_binaries()

    # 3. 分析配置文件

    self._analyze_configs()

    # 4. 生成报告

    report = self._generate_report()

    self.logger.info(f"扫描完成,发现 {self._count_findings()} 个敏感信息")

    return report

    def _scan_files(self):

    """扫描文件系统中的文件"""

    self.logger.info("扫描文件系统...")

    for root, dirs, files in os.walk(self.search_path):

    # 跳过一些目录

    dirs[:] = [d for d in dirs if not d.startswith('.') and d not in {'tmp', 'temp'}]

    for file in files:

    filepath = Path(root) / file

    rel_path = str(filepath.relative_to(self.search_path))

    # 检查是否敏感文件

    if self._is_sensitive_file(filepath):

    self.findings['sensitive_files'].append({

    'path': rel_path,

    'reason': '敏感扩展名',

    'size': filepath.stat().st_size

    })

    # 检查路径是否敏感

    for sensitive_path in self.SENSITIVE_PATHS:

    if sensitive_path in rel_path:

    self.findings['sensitive_files'].append({

    'path': rel_path,

    'reason': f'敏感路径包含: {sensitive_path}',

    'size': filepath.stat().st_size

    })

    break

    # 扫描文件内容

    self._scan_file_content(filepath, rel_path)

    def _is_sensitive_file(self, filepath: Path) -> bool:

    """判断是否为敏感文件"""

    # 检查扩展名

    if filepath.suffix.lower() in self.SENSITIVE_EXTENSIONS:

    return True

    # 检查文件名

    sensitive_names = {

    'passwd', 'shadow', 'sudoers', 'hosts',

    'authorized_keys', 'id_rsa', 'id_dsa',

    'credentials', 'config', 'settings',

    '.env', '.gitconfig', '.npmrc',

    }

    if filepath.name in sensitive_names or filepath.name.startswith('.'):

    return True

    return False

    def _scan_file_content(self, filepath: Path, rel_path: str):
        """Scan the contents of a single file."""
        try:
            # Guess the file type from its name
            mime_type = mimetypes.guess_type(str(filepath))[0]

            # Read text files and common config formats directly
            if mime_type and ('text/' in mime_type or
                              mime_type in {'application/json',
                                            'application/xml',
                                            'application/yaml'}):
                content = filepath.read_text(encoding='utf-8', errors='ignore')
                self._analyze_text_content(content, rel_path, 'text')

            # Run everything else under 10 MB through strings
            elif filepath.stat().st_size < 10 * 1024 * 1024:
                try:
                    result = subprocess.run(
                        ['strings', str(filepath)],
                        capture_output=True,
                        text=True,
                        timeout=10
                    )
                    if result.returncode == 0:
                        self._analyze_text_content(result.stdout, rel_path, 'binary')
                except (subprocess.SubprocessError, OSError):
                    pass

        except Exception as e:
            self.logger.debug(f"Failed to scan file {rel_path}: {e}")

    def _scan_binaries(self):
        """Run a deeper scan over binary files."""
        self.logger.info("Deep-scanning binary files...")

        binary_extensions = {'.bin', '.elf', '.so', '.dll', '.exe'}

        for root, dirs, files in os.walk(self.search_path):
            for file in files:
                if Path(file).suffix.lower() in binary_extensions:
                    filepath = Path(root) / file
                    self._deep_scan_binary(filepath)

    def _deep_scan_binary(self, filepath: Path):
        """Deep-scan one binary file."""
        try:
            # -n 6: minimum string length; -t x: prefix each string with its hex offset
            result = subprocess.run(
                ['strings', '-n', '6', '-t', 'x', str(filepath)],
                capture_output=True,
                text=True,
                timeout=30
            )

            if result.returncode == 0:
                content = result.stdout
                rel_path = str(filepath.relative_to(self.search_path))

                # Analyze the extracted strings
                self._analyze_text_content(content, rel_path, 'binary_deep')

                # Additional binary pattern checks
                self._analyze_binary_patterns(filepath, rel_path)

        except subprocess.TimeoutExpired:
            self.logger.warning(f"Binary scan timed out: {filepath}")
        except Exception as e:
            self.logger.debug(f"Binary scan failed {filepath}: {e}")

    def _analyze_binary_patterns(self, filepath: Path, rel_path: str):
        """Look for specific patterns in the first 512 bytes of a binary."""
        try:
            # Read the file header
            with open(filepath, 'rb') as f:
                header = f.read(512)

            # 1. Hard-coded IP addresses
            ip_pattern = rb'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}'
            ips = re.findall(ip_pattern, header)
            for ip in ips[:5]:  # at most 5
                self.findings['debug_backdoors'].append({
                    'type': 'hard-coded IP',
                    'value': ip.decode('ascii', errors='ignore'),
                    'source': rel_path,
                    'context': 'file header'
                })

            # 2. Debug-related strings
            debug_strings = [b'debug', b'test', b'backdoor', b'password']
            header_lower = header.lower()
            for debug_str in debug_strings:
                if debug_str in header_lower:
                    self.findings['debug_backdoors'].append({
                        'type': 'debug string',
                        'value': debug_str.decode('ascii'),
                        'source': rel_path,
                        'context': 'file header'
                    })

        except Exception as e:
            self.logger.debug(f"Binary pattern analysis failed: {e}")

    def _analyze_configs(self):
        """Run format-specific analysis on configuration files."""
        config_files = []

        for finding in self.findings['sensitive_files']:
            if 'config' in finding['path'].lower() or finding['path'].endswith('.conf'):
                config_files.append(finding['path'])

        self.logger.info(f"Found {len(config_files)} config files; running deeper analysis...")

        for config_path in config_files[:20]:  # analyze at most 20
            try:
                full_path = self.search_path / config_path
                content = full_path.read_text(encoding='utf-8', errors='ignore')

                # Dispatch on the config file format
                if config_path.endswith('.json'):
                    self._analyze_json_config(content, config_path)
                elif config_path.endswith('.yml') or config_path.endswith('.yaml'):
                    self._analyze_yaml_config(content, config_path)
                elif config_path.endswith('.xml'):
                    self._analyze_xml_config(content, config_path)

            except Exception as e:
                self.logger.debug(f"Config analysis failed for {config_path}: {e}")

    def _analyze_json_config(self, content: str, source: str):
        """Analyze a JSON config file."""
        try:
            import json as json_module
            config = json_module.loads(content)

            # Recursively search for sensitive keys
            self._search_json_recursive(config, source, [])

        except ValueError:
            # Not valid JSON: fall back to the regex-based scan
            self._analyze_text_content(content, source, 'config')

    def _search_json_recursive(self, data, source: str, path: List[str]):
        """Recursively search parsed JSON/YAML data for sensitive keys."""
        if isinstance(data, dict):
            for key, value in data.items():
                # str(): YAML mappings may use non-string keys
                current_path = path + [str(key)]

                # Check the dotted key path
                key_str = '.'.join(current_path)
                if any(sensitive in key_str.lower()
                       for sensitive in ['pass', 'secret', 'key', 'token']):
                    if value and isinstance(value, str) and len(value) > 4:
                        self.findings['config_secrets'].append({
                            'type': 'JSON config',
                            'key': key_str,
                            'value': value[:50] + ('...' if len(value) > 50 else ''),
                            'source': source
                        })

                # Recurse into nested containers
                if isinstance(value, (dict, list)):
                    self._search_json_recursive(value, source, current_path)

        elif isinstance(data, list):
            for i, item in enumerate(data):
                self._search_json_recursive(item, source, path + [str(i)])

    def _analyze_yaml_config(self, content: str, source: str):
        """Analyze a YAML config file."""
        try:
            import yaml  # PyYAML
            config = yaml.safe_load(content)
            self._search_json_recursive(config, source, [])
        except Exception:
            # PyYAML unavailable or invalid YAML: fall back to the regex-based scan
            self._analyze_text_content(content, source, 'config')

    def _analyze_xml_config(self, content: str, source: str):
        """Analyze an XML config file."""
        # Regex search for sensitive element bodies and attribute values
        patterns = [
            r'<(?:password|passwd|secret|key)[^>]*>([^<]+)</(?:password|passwd|secret|key)>',
            r'(?:password|secret|key)="([^"]+)"',
        ]

        for pattern in patterns:
            matches = re.findall(pattern, content, re.IGNORECASE)
            for match in matches:
                if len(match) > 4:
                    self.findings['config_secrets'].append({
                        'type': 'XML config',
                        'value': match[:50] + ('...' if len(match) > 50 else ''),
                        'source': source
                    })

    def _analyze_text_content(self, content: str, source: str, file_type: str):
        """Match every pattern in self.PATTERNS against the text."""
        for category, patterns in self.PATTERNS.items():
            for pattern in patterns:
                try:
                    matches = re.findall(pattern, content, re.MULTILINE | re.IGNORECASE)

                    for match in matches:
                        # findall returns tuples when the pattern has several groups
                        if isinstance(match, tuple):
                            # Take the first group, or join the rest if it is empty
                            value = match[0] if match[0] else ''.join(match[1:])
                        else:
                            value = match

                        # Skip empty or very short values
                        if not value or len(str(value).strip()) < 4:
                            continue

                        # Attach the surrounding lines as context
                        context = self._get_context(content, str(value))

                        self.findings[category].append({
                            'value': str(value)[:100],  # cap the length
                            'source': source,
                            'type': file_type,
                            'context': context,
                            'pattern': pattern[:50]
                        })

                except Exception as e:
                    self.logger.debug(f"Pattern match failed for {pattern}: {e}")

    def _get_context(self, content: str, target: str, context_lines: int = 2) -> str:
        """Return the lines surrounding the first occurrence of target."""
        lines = content.split('\n')

        for i, line in enumerate(lines):
            if target in line:
                start = max(0, i - context_lines)
                end = min(len(lines), i + context_lines + 1)
                context = '\n'.join(lines[start:end])
                return context[:500]  # cap the length

        # Target not found on any single line: return the target itself
        return target[:100]

    def _count_findings(self) -> int:
        """Return the total number of findings across all categories."""
        return sum(len(items) for category, items in self.findings.items()
                   if category != 'statistics')

    def _generate_report(self) -> Dict:
        """Build the statistics, assess the risk and save the reports."""
        from datetime import datetime

        self.logger.info("Generating report...")

        # Keep a scan time recorded earlier, otherwise stamp the report now
        previous = self.findings.get('statistics') or {}

        # Statistics
        stats = {
            'scan_time': previous.get('scan_time') or datetime.now().isoformat(timespec='seconds'),
            # Files present under the search path (upper bound on files actually scanned)
            'total_files_scanned': sum(len(files) for _, _, files in os.walk(self.search_path)),
            'total_findings': self._count_findings(),
            'by_category': {},
            'risk_assessment': {}
        }

        for category, items in self.findings.items():
            if category != 'statistics':
                stats['by_category'][category] = len(items)

        # Risk score: weighted count of the three most serious categories
        by_category = stats['by_category']
        risk_score = (10 * by_category.get('hardcoded_passwords', 0)
                      + 8 * by_category.get('crypto_keys', 0)
                      + 7 * by_category.get('debug_backdoors', 0))

        if risk_score > 50:
            risk_level = "Critical"
        elif risk_score > 20:
            risk_level = "High"
        elif risk_score > 10:
            risk_level = "Medium"
        else:
            risk_level = "Low"

        stats['risk_assessment'] = {
            'score': risk_score,
            'level': risk_level,
            'recommendation': self._get_recommendation(risk_level)
        }

        self.findings['statistics'] = stats

        # Save the reports
        self._save_reports()

        return self.findings

    def _get_recommendation(self, risk_level: str) -> str:
        """Return a recommendation for the given risk level."""
        recommendations = {
            "Critical": "Fix immediately. The number of hard-coded passwords, crypto keys "
                        "or debug backdoors found puts the system at serious risk.",
            "High": "Fix as soon as possible. Several sensitive-information leaks were found.",
            "Medium": "Fixing is recommended. Some security risks are present.",
            "Low": "Risk is low, but a code review is still recommended."
        }

        return recommendations.get(risk_level, "Please carry out a detailed security assessment.")

    def _save_reports(self):
        """Save the report in each supported format."""
        # 1. JSON report
        json_path = self.output_dir / 'sensitive_info_report.json'
        with open(json_path, 'w', encoding='utf-8') as f:
            json.dump(self.findings, f, indent=2, ensure_ascii=False, default=str)

        # 2. Markdown report
        md_path = self.output_dir / 'report.md'
        self._generate_markdown_report(md_path)

        # 3. CSV report (easy to import into other tools)
        csv_path = self.output_dir / 'findings.csv'
        self._generate_csv_report(csv_path)

        self.logger.info(f"Reports saved to: {self.output_dir}")

    def _generate_markdown_report(self, path: Path):
        """Write the report in Markdown format."""
        stats = self.findings['statistics']
        risk = stats['risk_assessment']
        scan_time = stats.get('scan_time', 'N/A')

        md_content = (
            "# Firmware Sensitive Information Scan Report\n\n"
            "## Overview\n\n"
            f"- **Scan path**: `{self.search_path}`\n"
            f"- **Scan time**: {scan_time}\n"
            f"- **Risk level**: **{risk['level']}**\n"
            f"- **Risk score**: {risk['score']}\n\n"
            "## Statistics\n\n"
        )

        # Summary table
        md_content += "| Category | Findings | Risk |\n"
        md_content += "|----------|----------|------|\n"

        for category, count in stats['by_category'].items():
            if count > 0:
                # Per-category risk label
                if category in ['hardcoded_passwords', 'crypto_keys']:
                    risk_label = "🔥 High"
                elif category in ['api_keys', 'debug_backdoors']:
                    risk_label = "⚠️ Medium"
                else:
                    risk_label = "ℹ️ Low"

                md_content += f"| {category} | {count} | {risk_label} |\n"

        # Detailed findings
        for category, items in self.findings.items():
            if category != 'statistics' and items:
                md_content += f"\n## {category.replace('_', ' ').title()}\n\n"

                for i, item in enumerate(items[:20], 1):  # show at most 20 per category
                    md_content += f"### Finding #{i}\n"
                    md_content += f"- **Value**: `{item.get('value', 'N/A')}`\n"
                    # sensitive_files entries carry 'path' instead of 'source'
                    md_content += f"- **Source**: `{item.get('source', item.get('path', 'N/A'))}`\n"
                    md_content += f"- **Type**: {item.get('type', 'N/A')}\n"

                    if item.get('context'):
                        md_content += f"- **Context**:\n```\n{item['context']}\n```\n"

                    md_content += "\n"

        # Recommendations
        md_content += (
            "\n## Security Recommendations\n\n"
            f"{risk['recommendation']}\n\n"
            "### Specific recommendations\n\n"
            "1. **Immediate actions**:\n"
            "   - Remove all hard-coded passwords and keys\n"
            "   - Rotate every leaked API key and token\n"
            "   - Disable or remove debug backdoors\n"
            "2. **Short-term improvements**:\n"
            "   - Adopt a secure key management scheme\n"
            "   - Encrypt configuration files\n"
            "   - Add access control and audit logging\n"
            "3. **Long-term planning**:\n"
            "   - Establish a secure development lifecycle (SDL)\n"
            "   - Conduct regular security code reviews\n"
            "   - Implement automated security testing\n\n"
            "## Disclaimer\n\n"
            "This report is intended for security research and educational purposes only. "
            "Make sure you have proper authorization before performing any security testing.\n\n"
            f"*Report generated: {scan_time}*\n"
        )

        with open(path, 'w', encoding='utf-8') as f:
            f.write(md_content)

    def _generate_csv_report(self, path: Path):
        """Write the findings in CSV format."""
        import csv

        with open(path, 'w', newline='', encoding='utf-8') as f:
            writer = csv.writer(f)

            # Header row
            writer.writerow(['Category', 'Value', 'Source', 'Type', 'Context', 'Risk'])

            # Data rows
            for category, items in self.findings.items():
                if category != 'statistics':
                    for item in items:
                        # Per-category risk label
                        if category in ['hardcoded_passwords', 'crypto_keys']:
                            risk = 'HIGH'
                        elif category in ['api_keys', 'debug_backdoors']:
                            risk = 'MEDIUM'
                        else:
                            risk = 'LOW'

                        writer.writerow([
                            category,
                            item.get('value', ''),
                            item.get('source', ''),
                            item.get('type', ''),
                            item.get('context', '').replace('\n', '\\n'),
                            risk
                        ])

# Example usage
if __name__ == "__main__":
    import sys

    if len(sys.argv) < 2:
        print("Usage: python sensitive_info_hunter.py <firmware_path> [output_dir]")
        sys.exit(1)

    firmware_path = sys.argv[1]
    output_dir = sys.argv[2] if len(sys.argv) > 2 else "./reports"

    if not os.path.exists(firmware_path):
        print(f"Error: path does not exist: {firmware_path}")
        sys.exit(1)

    print(f"Scanning: {firmware_path}")
    print(f"Output directory: {output_dir}")
    print("-" * 50)

    hunter = SensitiveInfoHunter(firmware_path, output_dir)
    report = hunter.scan()

    print("\nScan complete!")
    print(f"Total sensitive items found: {report['statistics']['total_findings']}")
    print(f"Risk level: {report['statistics']['risk_assessment']['level']}")
    print("Report files:")
    print(f"  - JSON: {output_dir}/sensitive_info_report.json")
    print(f"  - Markdown: {output_dir}/report.md")
    print(f"  - CSV: {output_dir}/findings.csv")
    print(f"  - Log: {output_dir}/hunter.log")
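The JSON report written by `_save_reports` tends to contain repeated hits, because the same string can match in both the plain and the deep binary scan. The following is a minimal post-processing sketch, not part of the scanner itself. It assumes the report layout used above (one list of dicts per category plus a `statistics` entry); `dedupe_findings` and `weighted_score` are hypothetical helper names, and the weights simply mirror the 10/8/7 values used in `_generate_report`.

```python
from typing import Dict, List

# Weights mirror the ones used in _generate_report (assumption: same categories).
RISK_WEIGHTS = {"hardcoded_passwords": 10, "crypto_keys": 8, "debug_backdoors": 7}


def dedupe_findings(findings: Dict[str, list]) -> Dict[str, List[dict]]:
    """Drop repeated (value, source, path) entries within each category."""
    cleaned: Dict[str, List[dict]] = {}
    for category, items in findings.items():
        if category == "statistics":
            continue  # not a list of findings
        seen = set()
        unique = []
        for item in items:
            key = (item.get("value"), item.get("source"), item.get("path"))
            if key not in seen:
                seen.add(key)
                unique.append(item)
        cleaned[category] = unique
    return cleaned


def weighted_score(findings: Dict[str, List[dict]]) -> int:
    """Recompute the weighted risk score on (deduplicated) findings."""
    return sum(RISK_WEIGHTS.get(category, 0) * len(items)
               for category, items in findings.items())
```

In practice one would load `sensitive_info_report.json` with `json.load`, pass the result through `dedupe_findings`, and compare the recomputed score with the one stored in the report; a large gap suggests the original score was inflated by duplicates.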

5.2 Quick Scan Script for Sensitive Information in Firmware

#!/bin/bash
# firmware_secrets_scan.sh - quick scan for sensitive information in firmware

set -e

# Color definitions
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Logging helpers
log_info() { echo -e "${BLUE}[INFO]${NC} $1"; }
log_success() { echo -e "${GREEN}[SUCCESS]${NC} $1"; }
log_warning() { echo -e "${YELLOW}[WARNING]${NC} $1"; }
log_error() { echo -e "${RED}[ERROR]${NC} $1"; }

# Check dependencies
check_dependencies() {
    local deps=("binwalk" "strings" "grep" "find" "file")
    local missing=()

    for dep in "${deps[@]}"; do
        if ! command -v "$dep" &> /dev/null; then
            missing+=("$dep")
        fi
    done

    if [ ${#missing[@]} -ne 0 ]; then
        log_error "Missing dependencies: ${missing[*]}"
        exit 1
    fi
}

# Quick scan
quick_scan() {
    local firmware="$1"
    local output_dir
    output_dir="${firmware%.*}_secrets_$(date +%Y%m%d_%H%M%S)"

    log_info "Starting quick scan: $(basename "$firmware")"
    mkdir -p "$output_dir"

    # 1. Basic information
    log_info "Collecting basic information..."
    {
        echo "=== Firmware basic information ==="
        echo "File name: $(basename "$firmware")"
        echo "Size: $(du -h "$firmware" | cut -f1)"
        echo "MD5: $(md5sum "$firmware" | cut -d' ' -f1)"
        echo "SHA256: $(sha256sum "$firmware" | cut -d' ' -f1)"
        echo "File type: $(file -b "$firmware")"
        echo ""
    } > "$output_dir/basic_info.txt"

    # 2. Direct string search
    log_info "Searching for hard-coded strings..."
    {
        echo "=== Hard-coded string search ==="
        echo ""
        echo "1. Credential-related strings:"
        strings "$firmware" | grep -iE "password|passwd|pwd|admin|root|login" | head -20
        echo ""
        echo "2. URLs and IP addresses:"
        strings "$firmware" | grep -E "https?://|ftp://|[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" | head -20
        echo ""
        echo "3. Debug and test strings:"
        strings "$firmware" | grep -iE "debug|test|backdoor|shell|console" | head -20
        echo ""
        echo "4. Version information:"
        strings "$firmware" | grep -iE "version|v[0-9]\.[0-9]|build|release" | head -20
        echo ""
        echo "5. Possible keys:"
        strings "$firmware" | grep -E "[A-Za-z0-9+/]{20,}={0,2}" | head -20
    } > "$output_dir/strings_analysis.txt"

    # 3. Quick binwalk signature scan
    # (no -q here: --quiet suppresses the very output we want to capture)
    log_info "Running binwalk analysis..."
    {
        echo "=== Binwalk quick analysis ==="
        echo ""
        binwalk "$firmware" | head -30
    } > "$output_dir/binwalk_analysis.txt"

    # 4. Extract files and search for configuration files
    log_info "Extracting the filesystem and searching for config files..."
    extract_dir="$output_dir/extracted"
    mkdir -p "$extract_dir"

    if binwalk -eq -C "$extract_dir" "$firmware" &> /dev/null; then
        {
            echo "=== Configuration file search ==="
            echo ""
            echo "1. Configuration files:"
            find "$extract_dir" -type f \( -name "*.conf" -o -name "*.cfg" -o -name "*.config" \) | head -20
            echo ""
            echo "2. Files containing passwords:"
            find "$extract_dir" -type f -exec grep -l -i "password" {} \; | head -20
            echo ""
            echo "3. Certificate and key files:"
            find "$extract_dir" -type f \( -name "*.pem" -o -name "*.key" -o -name "*.crt" \) | head -20
            echo ""
            echo "4. Shell scripts:"
            find "$extract_dir" -type f -name "*.sh" | head -20
        } > "$output_dir/config_files.txt"

        # 5. Check for known sensitive files
        log_info "Checking known sensitive files..."
        {
            echo "=== Known sensitive file check ==="
            echo ""

            sensitive_files=(
                "/etc/passwd"
                "/etc/shadow"
                "/etc/sudoers"
                "/root/.bash_history"
                "/root/.ssh/id_rsa"
                ".env"
                "config.php"
                "wp-config.php"
                "settings.py"
            )

            for sensitive in "${sensitive_files[@]}"; do
                if find "$extract_dir" -path "*$sensitive" -type f | grep -q .; then
                    echo "⚠️ Found: $sensitive"
                    find "$extract_dir" -path "*$sensitive" -type f
                    echo ""
                fi
            done
        } > "$output_dir/sensitive_files.txt"
    else
        log_warning "Extraction failed; skipping filesystem analysis"
    fi

    # 6. Generate the summary report
    log_info "Generating summary report..."
    {
        echo "# Firmware Sensitive Information Quick Scan Report"
        echo "## Scan information"
        echo "- Firmware file: $(basename "$firmware")"
        echo "- Scan time: $(date)"
        echo "- Scan mode: quick scan"
        echo ""
        echo "## Key findings"
        echo ""

        # Weighted count of the checks below that produced hits
        sensitive_count=0

        # Credential-like strings from the strings analysis
        if grep -qi "password\|admin\|root" "$output_dir/strings_analysis.txt"; then
            echo "🔴 **Possible hard-coded credentials**"
            grep -i "password\|admin\|root" "$output_dir/strings_analysis.txt" | head -5 | while read -r line; do
                echo " - $line"
            done
            # Plain assignment: ((sensitive_count++)) returns non-zero when the
            # old value is 0, which would abort the script under set -e
            sensitive_count=$((sensitive_count + 1))
            echo ""
        fi

        # Configuration files
        if [ -f "$output_dir/config_files.txt" ] && [ -s "$output_dir/config_files.txt" ]; then
            # grep -c prints 0 but exits non-zero when nothing matches
            config_count=$(grep -c "\.conf\|\.cfg\|\.config" "$output_dir/config_files.txt" 2>/dev/null || true)
            if [ "${config_count:-0}" -gt 0 ]; then
                echo "🟡 **Configuration files found** ($config_count)"
                grep "\.conf\|\.cfg\|\.config" "$output_dir/config_files.txt" | head -3 | while read -r line; do
                    echo " - $line"
                done
                echo ""
            fi
        fi

        # Sensitive files
        if [ -f "$output_dir/sensitive_files.txt" ] && grep -q "⚠️" "$output_dir/sensitive_files.txt"; then
            echo "🔴 **Sensitive files found**"
            grep "⚠️" "$output_dir/sensitive_files.txt" | while read -r line; do
                echo " - $line"
            done
            sensitive_count=$((sensitive_count + 2))
            echo ""
        fi

        echo "## Risk assessment"
        if [ "$sensitive_count" -ge 3 ]; then
            echo "**Risk level: 🔴 High**"
            echo "Multiple kinds of sensitive information were found; a detailed analysis is recommended immediately."
        elif [ "$sensitive_count" -ge 1 ]; then
            echo "**Risk level: 🟡 Medium**"
            echo "Some sensitive information was found; further analysis is recommended."
        else
            echo "**Risk level: 🟢 Low**"
            echo "No obvious sensitive information was found, but a full analysis is still recommended."
        fi

        echo ""
        echo "## Suggested next steps"
        echo "1. Review the detailed report files"
        echo "2. Run the full analysis tool for a deep scan"
        echo "3. Focus on the sensitive files that were found"
        echo "4. Verify the firmware's integrity and signature"
        echo ""
        echo "## Generated files"
        echo "- basic_info.txt - basic information"
        echo "- strings_analysis.txt - strings analysis"
        echo "- binwalk_analysis.txt - binwalk analysis"
        if [ -f "$output_dir/config_files.txt" ]; then
            echo "- config_files.txt - configuration file list"
        fi
        if [ -f "$output_dir/sensitive_files.txt" ]; then
            echo "- sensitive_files.txt - sensitive file check"
        fi
    } > "$output_dir/SUMMARY.md"

    log_success "Quick scan complete!"
    echo ""
    echo "Output directory: $output_dir"
    echo "Summary report: $output_dir/SUMMARY.md"
    echo ""

    # Show a short summary
    if [ -f "$output_dir/SUMMARY.md" ]; then
        echo "=== Scan summary ==="
        grep -A5 "## Key findings" "$output_dir/SUMMARY.md" | tail -10
        echo ""
        grep -A2 "## Risk assessment" "$output_dir/SUMMARY.md" | tail -3
    fi
}

# Entry point
main() {
    check_dependencies

    if [ $# -lt 1 ]; then
        log_error "Usage: $0 <firmware_file>"
        log_info "Example: $0 firmware.bin"
        exit 1
    fi

    firmware="$1"

    if [ ! -f "$firmware" ]; then
        log_error "File not found: $firmware"
        exit 1
    fi

    quick_scan "$firmware"
}

main "$@"
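The "possible keys" search in the script above matches any run of 20 or more Base64 characters, which also picks up symbol names, paths and padding. A common way to cut this noise is to keep only candidates with high Shannon entropy. The sketch below illustrates that idea in Python. It is not part of the script: `shannon_entropy` and `likely_secrets` are hypothetical helper names, and the 4.0 bits-per-character threshold is an assumption that should be tuned against real firmware.

```python
import math
import re
from collections import Counter
from typing import List

# Same candidate pattern as the "possible keys" grep in the shell script.
CANDIDATE = re.compile(r"[A-Za-z0-9+/]{20,}={0,2}")


def shannon_entropy(text: str) -> float:
    """Shannon entropy of text, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def likely_secrets(lines: List[str], threshold: float = 4.0) -> List[str]:
    """Return Base64-like candidates whose entropy reaches the threshold."""
    hits = []
    for line in lines:
        for candidate in CANDIDATE.findall(line):
            if shannon_entropy(candidate) >= threshold:
                hits.append(candidate)
    return hits
```

Feeding it the lines of `strings_analysis.txt` (or raw `strings` output) leaves a much shorter list to review by hand. Low-entropy runs such as repeated padding characters are dropped, while random-looking key material stays.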
