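This article takes a hands-on approach to security analysis of embedded device firmware. It covers the full workflow, from firmware extraction and filesystem analysis to reverse engineering and vulnerability discovery. Whether you are a security researcher, an IoT developer, or a security enthusiast, you should come away with practical skills.

1. Firmware Security: Why Does It Matter?

1.1 Firmware Vulnerabilities in the Real World

Before getting into the technical details, consider a few alarming cases:

Case 1: Router backdoor incident (CVE-2019-19824)

The firmware of a well-known router brand was found to contain a hard-coded backdoor account. With a specific username and password, an attacker could take full control of the device. The vulnerability affected more than one million devices.

Case 2: Smart camera privacy leak

A weak encryption implementation in an IoT camera's firmware let attackers decrypt the video stream with little effort, putting the home privacy of millions of users at risk.

Case 3: Industrial PLC firmware vulnerability

The PLC firmware of an industrial control system contained a buffer overflow that allowed remote code execution, which could halt a production line or even cause a safety incident.

1.2 The Firmware Threat Landscape

1.3 Learning Roadmap: From Zero to Proficiency

This article follows the path below, going one step deeper at each stage:

Firmware acquisition → Extraction and unpacking → Filesystem analysis → Sensitive information gathering →

Reverse engineering → Vulnerability discovery → Exploit development → Hardening advice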

2. Environment Setup: Deploying the Analysis Environment in One Step

Before starting, use Docker to build a complete analysis environment:

# Dockerfile.firmware-analysis
FROM ubuntu:22.04

LABEL maintainer="firmware-security@example.com"
LABEL version="1.0"
LABEL description="All-in-one firmware security analysis environment"

# Time zone and non-interactive installation
ENV TZ=Asia/Shanghai
ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && apt-get install -y --no-install-recommends \
    # Basic tools (ca-certificates and unzip are required by the download steps below)
    build-essential ca-certificates git curl wget unzip vim nano file tree \
    # Core firmware analysis tools
    binwalk srecord flashrom lzop u-boot-tools mtd-utils \
    # Emulation and debugging
    qemu-user-static qemu-system gdb-multiarch strace ltrace \
    # Reverse engineering (Ghidra needs a JDK)
    radare2 hexedit xxd openjdk-17-jdk \
    # Network analysis
    net-tools netcat-openbsd tcpdump nmap \
    # Python environment
    python3 python3-pip python3-dev \
    # Other dependencies
    squashfs-tools cpio zlib1g-dev liblzma-dev liblzo2-dev \
    && rm -rf /var/lib/apt/lists/*

# Python tools
RUN pip3 install --no-cache-dir \
    pycryptodome \
    scapy \
    requests \
    beautifulsoup4 \
    colorama \
    progressbar2

# Working directories
WORKDIR /workspace
RUN mkdir -p /workspace/firmware /workspace/output /workspace/scripts

# Community tools
# binwalk is already installed from apt above. This step builds the 2.x Python
# release from source; a 2.x tag is pinned because setup.py is only present there.
RUN git clone --depth 1 --branch v2.3.4 https://github.com/ReFirmLabs/binwalk.git /opt/binwalk && \
    cd /opt/binwalk && \
    python3 setup.py install

# extract-firmware.sh expects a firmware image as its argument, so it is run at
# analysis time rather than during the image build.
RUN git clone https://github.com/rampageX/firmware-mod-kit.git /opt/firmware-mod-kit

RUN git clone https://github.com/craigz28/firmwalker.git /opt/firmwalker

# Download and install Ghidra
# Check the archive name against the Ghidra releases page before building:
# the date suffix must match the chosen release.
RUN wget -q https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_10.3_build/ghidra_10.3_PUBLIC_20230829.zip && \
    unzip ghidra_10.3_PUBLIC_20230829.zip -d /opt/ && \
    rm ghidra_10.3_PUBLIC_20230829.zip && \
    ln -s /opt/ghidra_10.3_PUBLIC /opt/ghidra

# Environment variables
ENV PATH="/opt/ghidra:/opt/firmware-mod-kit:${PATH}"
ENV GHIDRA_INSTALL_DIR="/opt/ghidra"

# Shortcut launcher: Ghidra is started through its ghidraRun script
RUN printf '#!/bin/bash\nexec /opt/ghidra/ghidraRun "$@"\n' > /usr/local/bin/ghidra && \
    chmod +x /usr/local/bin/ghidra

# Default command
CMD ["/bin/bash"]

Build and run it with the following commands:

# Build the Docker image
docker build -t firmware-analysis:latest -f Dockerfile.firmware-analysis .

# Run the container with local directories mounted
docker run -it --rm \
    -v "$(pwd)/firmware:/workspace/firmware" \
    -v "$(pwd)/output:/workspace/output" \
    -v "$(pwd)/scripts:/workspace/scripts" \
    --name firmware-lab \
    firmware-analysis:latest

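3. Firmware Acquisition: Four Routes Explained

3.1 Route 1: Official Channels

The most direct way is to download the firmware from the device vendor's website: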

#!/usr/bin/env python3
"""
Automated firmware download script.
Supports firmware downloads for several router vendors.
"""
import os
import re
from typing import Dict, List
from urllib.parse import urljoin

import requests


class FirmwareDownloader:
    """Automated firmware downloader."""

    VENDOR_URLS = {
        'tplink': 'https://www.tp-link.com.cn/download-center.html',
        'asus': 'https://www.asus.com.cn/support/Download-Center/',
        'netgear': 'https://www.netgear.com.cn/support/download/',
        'dlink': 'http://support.dlink.com.cn/',
        'mercury': 'http://service.mercurycom.com.cn/download.html'
    }

    def __init__(self, vendor: str, model: str):
        self.vendor = vendor.lower()
        self.model = model.upper()
        self.session = requests.Session()
        self.session.headers.update({
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
        })

    def search_firmware(self) -> List[Dict]:
        """Search for firmware for the configured model."""
        if self.vendor not in self.VENDOR_URLS:
            raise ValueError(f"Unsupported router vendor: {self.vendor}")
        base_url = self.VENDOR_URLS[self.vendor]
        response = self.session.get(base_url, timeout=30)
        response.raise_for_status()
        # Page parsing differs from vendor to vendor
        if self.vendor == 'tplink':
            return self._parse_tplink(response.text)
        # Parsers for the other vendors are not implemented in this example
        raise NotImplementedError(f"No parser implemented for vendor: {self.vendor}")

    def _parse_tplink(self, html: str) -> List[Dict]:
        """Parse the TP-Link download page."""
        # Real parsing logic is more involved; this is a simplified example.
        # re.escape keeps special characters in the model name from changing the pattern.
        pattern = rf'href="([^"]*{re.escape(self.model)}[^"]*\.(?:bin|zip|rar|7z))"'
        matches = re.findall(pattern, html, re.IGNORECASE)
        firmwares = []
        for match in matches:
            firmwares.append({
                'url': urljoin(self.VENDOR_URLS['tplink'], match),
                'filename': os.path.basename(match),
                'model': self.model,
                'vendor': 'TP-Link'
            })
        return firmwares

    def download(self, firmware_info: Dict, save_dir: str = './firmware') -> str:
        """Download a firmware file and return the path it was saved to."""
        os.makedirs(save_dir, exist_ok=True)
        url = firmware_info['url']
        # Use only the base name so the file always lands inside save_dir
        filename = os.path.basename(firmware_info['filename'])
        save_path = os.path.join(save_dir, filename)
        print(f"Downloading: {filename}")
        response = self.session.get(url, stream=True, timeout=60)
        response.raise_for_status()
        with open(save_path, 'wb') as f:
            for chunk in response.iter_content(chunk_size=8192):
                if chunk:
                    f.write(chunk)
        print(f"Download complete: {save_path}")
        return save_path


# Usage example
if __name__ == '__main__':
    downloader = FirmwareDownloader('tplink', 'WR842N')
    firmwares = downloader.search_firmware()
    if firmwares:
        latest = firmwares[0]  # assumes the first entry is the latest version
        downloader.download(latest)
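A downloaded image is worth checking against the checksum the vendor publishes, when one is available. The sketch below is a minimal illustration of that step and is not part of the downloader above; the function names `sha256_of_file` and `verify_firmware` are hypothetical, and it assumes the vendor publishes a SHA-256 digest.

```python
import hashlib


def sha256_of_file(path, chunk_size=8192):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_firmware(path, expected_sha256):
    """Compare the file digest with a published digest (case-insensitive)."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

Reading in chunks keeps memory use flat even for images of several hundred megabytes.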

3.2 Route 2: Capturing OTA Updates

Many IoT devices support online updates, and the update request can be intercepted:

#!/usr/bin/env python3

"""

Intercept and analyze OTA firmware update packages.

Uses mitmproxy as a man-in-the-middle proxy.

"""

from mitmproxy import http

import re

import json

import hashlib

from pathlib import Path

import datetime

class OTAFirmwareInterceptor:

"""OTA固件拦截器"""

    def __init__(self, output_dir="./ota_captures"):

self.output_dir = Path(output_dir)

self.output_dir.mkdir(exist_ok=True)

        self.firmware_patterns = [
            r'\.bin$', r'\.img$', r'\.fwpkg$',
            r'firmware', r'update', r'upgrade'
        ]

def request(self, flow: http.HTTPFlow):

"""HTTP请求拦截"""

# 检测固件更新请求

if self._is_firmware_request(flow.request):

print(f"[+] 检测到固件更新请求: {flow.request.url}")

def response(self, flow: http.HTTPFlow):

"""HTTP响应拦截"""

if self._is_firmware_response(flow.response):

self._save_firmware(flow)

def _is_firmware_request(self, request) -> bool:

"""判断是否为固件更新请求"""

url = str(request.url).lower()

content_type = request.headers.get('Content-Type', '').lower()

# 检查URL中的关键词

url_matches = any(pattern in url for pattern in ['firmware', 'update', 'upgrade'])

# 检查Content-Type

content_matches = any(pattern in content_type

for pattern in ['octet-stream', 'binary', 'zip'])

return url_matches or content_matches

def _is_firmware_response(self, response) -> bool:

"""判断响应是否为固件文件"""

content_type = response.headers.get('Content-Type', '').lower()

content_disposition = response.headers.get('Content-Disposition', '').lower()

# 常见固件Content-Type

firmware_types = [

'application/octet-stream',

'application/x-firmware',

'application/zip',

'application/x-gzip'

]

# 检查文件名

filename = None

if 'filename=' in content_disposition:

filename = content_disposition.split('filename=')[-1].strip('"\'')

# 多种判断条件

conditions = [

any(ft in content_type for ft in firmware_types),

filename and any(filename.endswith(ext)

for ext in ['.bin', '.img', '.zip', '.gz']),

len(response.content) > 1024 * 1024, # 大于1MB的文件

]

return any(conditions)

def _save_firmware(self, flow: http.HTTPFlow):

"""保存拦截到的固件"""

timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")

url_hash = hashlib.md5(flow.request.url.encode()).hexdigest()[:8]

# 尝试获取文件名

        content_disposition = flow.response.headers.get('Content-Disposition', '')
        filename = ''
        if 'filename=' in content_disposition:
            # Keep only the base name so a server-supplied path cannot escape output_dir
            raw_name = content_disposition.split('filename=')[-1].strip('"\'')
            filename = Path(raw_name).name
        if not filename:
            filename = f"firmware_{timestamp}_{url_hash}.bin"
        save_path = self.output_dir / filename

# 保存固件文件

with open(save_path, 'wb') as f:

f.write(flow.response.content)

# 保存请求元数据

meta = {

'url': flow.request.url,

'method': flow.request.method,

'headers': dict(flow.request.headers),

'timestamp': timestamp,

'size': len(flow.response.content),

'md5': hashlib.md5(flow.response.content).hexdigest(),

'sha256': hashlib.sha256(flow.response.content).hexdigest()

}

meta_path = save_path.with_suffix('.json')

with open(meta_path, 'w', encoding='utf-8') as f:

json.dump(meta, f, indent=2, ensure_ascii=False)

print(f"[+] 固件已保存: {save_path}")

print(f" 大小: {len(flow.response.content) / 1024 / 1024:.2f} MB")

print(f" MD5: {meta['md5']}")

# 自动进行初步分析

self._quick_analysis(save_path)

def _quick_analysis(self, firmware_path: Path):

"""快速分析固件文件"""

print(f"\n[+] 开始快速分析...")

# 1. 文件类型识别

import subprocess

result = subprocess.run(['file', str(firmware_path)],

capture_output=True, text=True)

print(f" 文件类型: {result.stdout.strip()}")

# 2. 查看文件头

with open(firmware_path, 'rb') as f:

header = f.read(512)

hex_dump = ' '.join(f'{b:02x}' for b in header[:64])

print(f" 文件头(前64字节): {hex_dump}")

# 3. 搜索字符串

strings_result = subprocess.run(['strings', str(firmware_path)],

capture_output=True, text=True)

strings = strings_result.stdout.split('\n')

# 查找可能的关键词

keywords = ['root', 'admin', 'password', 'kernel', 'linux', 'vmlinux']

found = []

for keyword in keywords:

for string in strings:

if keyword.lower() in string.lower():

found.append(string[:50])

break

if found:

print(f" 发现关键词: {found[:5]}...")

# Register the addon with mitmproxy (load this script with: mitmdump -s <this_script>.py)

addons = [OTAFirmwareInterceptor()]
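Content-Type headers and file extensions are easy to get wrong, so a captured payload can also be classified by its leading magic bytes. The following is a minimal sketch of that idea, separate from the interceptor above; `MAGIC_SIGNATURES` and `identify_by_magic` are hypothetical names, and the table lists only a handful of well-known formats.

```python
from typing import Optional

# Well-known leading byte sequences for formats often seen in firmware updates
MAGIC_SIGNATURES = {
    b"\x27\x05\x19\x56": "U-Boot uImage",
    b"hsqs": "SquashFS (little-endian)",
    b"sqsh": "SquashFS (big-endian)",
    b"\x1f\x8b\x08": "gzip",
    b"PK\x03\x04": "ZIP",
    b"\x7fELF": "ELF",
}


def identify_by_magic(data: bytes) -> Optional[str]:
    """Return a format name if the payload starts with a known magic value."""
    for magic, name in MAGIC_SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None
```

A check like this could complement the size and header heuristics in `_is_firmware_response`, which on their own also match ordinary large downloads.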

3.3 Route 3: Physical Extraction

When the firmware cannot be obtained through software, physical extraction is the last resort:

#!/bin/bash
# firmware_extraction.sh
# Automation script for physical firmware extraction

set -e

# Colored output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color

# Log to stderr so that $(...) captures only a function's real output
log_info() {
    echo -e "${GREEN}[INFO]${NC} $1" >&2
}

log_warn() {
    echo -e "${YELLOW}[WARN]${NC} $1" >&2
}

log_error() {
    echo -e "${RED}[ERROR]${NC} $1" >&2
}

# Check required tools
check_tools() {
    local tools=("flashrom" "sdparm" "dd" "strings" "binwalk" "file")
    local missing=()
    for tool in "${tools[@]}"; do
        if ! command -v "$tool" &> /dev/null; then
            missing+=("$tool")
        fi
    done
    if [ ${#missing[@]} -ne 0 ]; then
        log_error "Missing required tools: ${missing[*]}"
        exit 1
    fi
}

# Identify the chip type
identify_chip() {
    log_info "Identifying the flash chip..."
    # Probe with flashrom
    if sudo flashrom --programmer linux_spi:dev=/dev/spidev0.0 -r /tmp/test.bin &> /tmp/chip_info.txt; then
        local chip_info chip_name
        chip_info=$(grep -i "found" /tmp/chip_info.txt || true)
        log_info "Chip info: $chip_info"
        # flashrom -c expects the bare chip name, i.e. the quoted part of the "Found ..." line
        chip_name=$(echo "$chip_info" | sed -n 's/.*flash chip "\([^"]*\)".*/\1/p' | head -n 1)
        echo "${chip_name:-unknown}"
    else
        log_warn "Automatic identification failed; identify the chip manually"
        echo "unknown"
    fi
}

# Read through a programmer
read_with_flashrom() {
    local chip=$1
    local output_file=$2
    log_info "Reading chip with flashrom: $chip"
    # Pass -c only when a chip name was identified
    local chip_args=()
    if [ -n "$chip" ] && [ "$chip" != "unknown" ]; then
        chip_args=(-c "$chip")
    fi
    # Try different programmers
    local programmers=("linux_spi:dev=/dev/spidev0.0" "ch341a_spi" "dediprog")
    for programmer in "${programmers[@]}"; do
        log_info "Trying $programmer..."
        if sudo flashrom --programmer "$programmer" "${chip_args[@]}" -r "$output_file" 2>/dev/null; then
            log_info "Read succeeded: $output_file"
            return 0
        fi
    done
    log_error "All programmers failed"
    return 1
}

# Read through the debug interface
read_via_jtag() {
    local output_file=$1
    log_info "Trying to read over JTAG..."
    # Check whether a JTAG debugger is connected
    if lsusb | grep -i "jtag\|openocd\|ftdi" &> /dev/null; then
        log_info "JTAG debugger detected"
        # The OpenOCD script has to be written for the specific device
        cat > /tmp/jtag_read.cfg << EOF
# Example OpenOCD configuration
source [find interface/jlink.cfg]
transport select jtag
source [find target/stm32f1x.cfg]
init
dump_image $output_file 0x08000000 0x10000
shutdown
EOF
        if openocd -f /tmp/jtag_read.cfg &> /tmp/openocd.log; then
            log_info "JTAG read succeeded"
            return 0
        fi
    fi
    return 1
}

# Read through UART
read_via_uart() {
    local device=$1
    local baudrate=$2
    local output_file=$3
    log_info "Trying to fetch the firmware interactively over UART ($device @ $baudrate)..."
    # Automate the UART session with an expect script
    cat > /tmp/uart_read.exp << EOF
#!/usr/bin/expect -f
set timeout 10
spawn screen $device $baudrate
expect {
    "login:" { send "root\\r" }
    "Password:" { send "admin\\r" }
    "# " { send "cat /proc/mtd\\r" }
    timeout { exit 1 }
}
expect "# "
send "dd if=/dev/mtd0 of=/tmp/firmware.bin\\r"
expect "# "
send "exit\\r"
expect eof
EOF
    if expect /tmp/uart_read.exp &> /tmp/uart.log; then
        # In practice the file still has to be transferred off the device by other means
        log_info "UART session finished; retrieve the file manually"
        return 0
    fi
    return 1
}

# Analyze the extracted firmware
analyze_firmware() {
    local firmware_file=$1
    log_info "\n===== Firmware Analysis Report ====="
    # Basic information
    log_info "File information:"
    file "$firmware_file"
    # Size
    local size
    size=$(stat -c%s "$firmware_file")
    log_info "File size: $(numfmt --to=iec "$size")"
    # binwalk analysis
    log_info "\nBinwalk analysis:"
    binwalk "$firmware_file" | head -20
    # String search (keywords reused from the quick analysis in section 3.2)
    log_info "\nStrings found (keywords):"
    strings "$firmware_file" | grep -iE 'root|admin|password|kernel|linux' | head -20 || true
    # Hashes
    log_info "\nFile hashes:"
    md5sum "$firmware_file"
    sha256sum "$firmware_file"
}

main() {
    check_tools
    local chip_type=""
    local output_file="firmware_dump_$(date +%Y%m%d_%H%M%S).bin"
    echo "Choose an extraction method:"
    echo "1) Flash programmer"
    echo "2) JTAG debug interface"
    echo "3) UART serial port"
    echo "4) Try all methods automatically"
    read -p "Select [1-4]: " method
    # "|| true" lets a failed read fall through to the final check instead of tripping set -e
    case $method in
        1)
            chip_type=$(identify_chip)
            read_with_flashrom "$chip_type" "$output_file" || true
            ;;
        2)
            read_via_jtag "$output_file" || true
            ;;
        3)
            read -p "Serial device (e.g. /dev/ttyUSB0): " uart_device
            read -p "Baud rate (e.g. 115200): " uart_baud
            read_via_uart "$uart_device" "$uart_baud" "$output_file" || true
            ;;
        4)
            log_info "Trying all extraction methods..."
            # Try each method in order
            ;;
        *)
            log_error "Invalid selection"
            exit 1
            ;;
    esac
    if [ -f "$output_file" ] && [ -s "$output_file" ]; then
        analyze_firmware "$output_file"
        log_info "Firmware saved to: $output_file"
    else
        log_error "Firmware extraction failed"
    fi
}

main "$@"
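The UART path above runs `cat /proc/mtd` on the device to list the flash partitions before dumping one of them. As a rough illustration of how that listing could be turned into a dump plan, here is a small parser; the function name `parse_proc_mtd` and the sample partition table in the usage note are hypothetical, and it assumes the usual Linux `/proc/mtd` layout of `mtdN: <size hex> <erasesize hex> "name"`.

```python
import re
from typing import Dict, List

# One partition per line, e.g.: mtd0: 00040000 00010000 "u-boot"
_MTD_LINE = re.compile(r'^(mtd\d+):\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+"([^"]*)"')


def parse_proc_mtd(text: str) -> List[Dict]:
    """Parse /proc/mtd output into a list of partition records."""
    partitions = []
    for line in text.splitlines():
        match = _MTD_LINE.match(line.strip())
        if match:  # the header line "dev: size erasesize name" does not match
            dev, size, erase, name = match.groups()
            partitions.append({
                "device": f"/dev/{dev}",
                "size": int(size, 16),       # sizes are printed in hexadecimal
                "erasesize": int(erase, 16),
                "name": name,
            })
    return partitions
```

For example, feeding it a listing with `mtd0: 00040000 00010000 "u-boot"` yields a record for `/dev/mtd0` with a size of 262144 bytes, which gives the expected size of the corresponding `dd` dump.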

4. Binwalk in Depth: From Basics to Advanced Use

4.1 Basic Scanning and Extraction

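#!/bin/bash
# binwalk_analysis.sh

FIRMWARE_FILE="$1"
OUTPUT_DIR="${FIRMWARE_FILE%.*}_extracted"
mkdir -p "$OUTPUT_DIR"

echo "=== Firmware analysis: $(basename "$FIRMWARE_FILE") ==="

# 1. Signature scan that also shows results binwalk marks as invalid (-I)
echo "[1/6] Signature scan (including invalid results)..."
binwalk -I "$FIRMWARE_FILE"

# 2. Opcode scan (-A searches for executable code and hints at the CPU architecture)
echo -e "\n[2/6] Opcode scan..."
binwalk -A "$FIRMWARE_FILE" | head -20

# 3. Entropy analysis (detect encryption/compression); --nplot skips the graph on headless systems
echo -e "\n[3/6] Entropy analysis..."
binwalk -E --nplot "$FIRMWARE_FILE"

# 4. Recursive extraction (the main step)
# Note: recent binwalk 2.x releases refuse to extract as root unless --run-as=root is given
echo -e "\n[4/6] Recursively extracting the filesystem..."
binwalk -Me "$FIRMWARE_FILE" -C "$OUTPUT_DIR"

# 5. Extract specific file types (-D format: <type>:<extension>:<command>)
echo -e "\n[5/6] Extracting specific file types..."
binwalk -D 'squashfs:squashfs:unsquashfs -d %e.dir %e' \
    -D 'gzip:gz:gunzip %e' \
    -D 'lzma:lzma:lzma -d %e' \
    "$FIRMWARE_FILE" \
    -C "$OUTPUT_DIR/extracted_types"

# 6. Generate the analysis report
echo -e "\n[6/6] Generating the analysis report..."
{
    echo "# Firmware Analysis Report"
    echo "## Basic Information"
    echo "- File name: $(basename "$FIRMWARE_FILE")"
    echo "- Size: $(du -h "$FIRMWARE_FILE" | cut -f1)"
    echo "- Analysis time: $(date)"
    echo ""
    echo "## File Structure"
    binwalk "$FIRMWARE_FILE"
    echo ""
    echo "## Extracted Files"
    find "$OUTPUT_DIR" -type f | head -30
} > "$OUTPUT_DIR/analysis_report.md"

echo -e "\n=== Analysis complete ==="
echo "Output directory: $OUTPUT_DIR"
echo "Report file: $OUTPUT_DIR/analysis_report.md"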
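The report above stores binwalk's plain-text table, which later automation usually has to read back. The sketch below shows one way to turn that table into structured records; `parse_binwalk_output` is a hypothetical helper, and it assumes the standard binwalk 2.x layout with DECIMAL, HEXADECIMAL, and DESCRIPTION columns.

```python
import re
from typing import List, Tuple

# A result row looks like: "64            0x40            LZMA compressed data, ..."
_ROW = re.compile(r'^(\d+)\s+0x[0-9A-Fa-f]+\s+(.+)$', re.MULTILINE)


def parse_binwalk_output(text: str) -> List[Tuple[int, str]]:
    """Return (offset, description) pairs from binwalk's text output."""
    return [(int(offset), desc.strip()) for offset, desc in _ROW.findall(text)]
```

Header and separator lines are skipped automatically because they do not begin with a decimal offset.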

4.2 Advanced Techniques: Custom Signatures and Automation

Create a custom signature file, custom.sig:

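# custom.sig - custom firmware signatures
# Format (libmagic-style, as used by binwalk 2.x): offset  type  test  description
# The magic values below are illustrative, not verified vendor formats.

# 1. TP-Link firmware header (example): usually begins with "TP-LINK"
0       string      TP-LINK                 TP-Link firmware header (tp-link-header, .tlh)

# 2. Custom encrypted format: some devices use their own encryption; match its magic number (89 43 52 59 50 54)
0       string      \x89CRYPT               Custom encrypted container (custom-encrypted, .enc)

# 3. U-Boot image: U-Boot images usually carry a recognizable header
0       string      U-Boot                  U-Boot image (u-boot-image, .uboot)

# 4. Filesystem at an offset: for images where the filesystem does not start at the beginning
1024    string      sqsh                    SquashFS filesystem (squashfs-lzma, .sfs)

# 5. Compression format: a specific compression header (5D 00 00 80 00)
0       string      \x5D\x00\x00\x80\x00    Custom LZMA stream (lzma-custom, .clz)

# 6. Version information: look for a version string at a fixed offset
512     string      Version:                Version string (version-string, .ver)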

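Using the custom signatures:

# Scan with the custom signature file (-m/--magic; -f would write a log file over custom.sig)
binwalk -m custom.sig firmware.bin

# Combine with the built-in signatures (-B) and scan recursively
binwalk -B -m custom.sig -M firmware.bin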
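To make clear what a signature scan does with entries like these, here is a deliberately simplified sketch that searches a firmware blob for byte signatures at every offset. It is not how binwalk is implemented; `scan_signatures` and the signature table in the usage note are hypothetical, and real scanners also validate the structure that follows each magic value to reduce false positives.

```python
from typing import Dict, List, Tuple


def scan_signatures(data: bytes, signatures: Dict[str, bytes]) -> List[Tuple[int, str]]:
    """Return sorted (offset, name) pairs for every occurrence of each signature."""
    hits = []
    for name, magic in signatures.items():
        start = 0
        while True:
            index = data.find(magic, start)
            if index == -1:
                break
            hits.append((index, name))
            start = index + 1  # keep searching after this hit
    return sorted(hits)
```

Calling it with `{"tp-link-header": b"TP-LINK", "squashfs": b"sqsh"}` on an image reports each offset at which either marker occurs, which is the raw information a tool such as binwalk then filters and annotates.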

4.3 Automated Analysis with the Python API

#!/usr/bin/env python3

"""

Binwalk Python API自动化分析框架

"""

import binwalk

import os

import json

import hashlib

import subprocess

from pathlib import Path

from typing import Dict, List, Any

import logging

class AdvancedFirmwareAnalyzer:

"""高级固件分析器"""

    def __init__(self, firmware_path: str):

self.firmware_path = Path(firmware_path)

self.output_dir = self.firmware_path.parent / f"{self.firmware_path.stem}_analysis"

self.output_dir.mkdir(exist_ok=True)

# 设置日志

logging.basicConfig(

level=logging.INFO,

format='%(asctime)s - %(levelname)s - %(message)s',

handlers=[

logging.FileHandler(self.output_dir / 'analysis.log'),

logging.StreamHandler()

]

)

        self.logger = logging.getLogger(__name__)

def comprehensive_scan(self) -> Dict[str, Any]:

"""执行全面扫描"""

self.logger.info(f"开始分析固件: {self.firmware_path.name}")

results = {

'file_info': self._get_file_info(),

'entropy_analysis': self._entropy_analysis(),

'signature_scan': self._signature_scan(),

'strings_analysis': self._strings_analysis(),

'extraction_results': self._extract_files()

}

# 保存结果

report_path = self.output_dir / 'comprehensive_report.json'

with open(report_path, 'w', encoding='utf-8') as f:

json.dump(results, f, indent=2, ensure_ascii=False)

self.logger.info(f"分析完成，报告已保存: {report_path}")

return results

def _get_file_info(self) -> Dict[str, Any]:

"""获取文件基本信息"""

import magic

import subprocess

info = {}

info['path'] = str(self.firmware_path)

info['size'] = os.path.getsize(self.firmware_path)

info['md5'] = self._calculate_hash('md5')

info['sha256'] = self._calculate_hash('sha256')

# 文件类型

try:

mime = magic.Magic(mime=True)

info['mime_type'] = mime.from_file(str(self.firmware_path))

file_type = magic.Magic()

info['file_type'] = file_type.from_file(str(self.firmware_path))

        except Exception:
            info['file_type'] = subprocess.run(
                ['file', str(self.firmware_path)],
                capture_output=True, text=True
            ).stdout.strip()
        return info

    def _calculate_hash(self, algorithm: str) -> str:
        """Compute the file hash."""
        hash_func = getattr(hashlib, algorithm)()
        with open(self.firmware_path, 'rb') as f:
            for chunk in iter(lambda: f.read(4096), b''):
                hash_func.update(chunk)
        return hash_func.hexdigest()

    def _entropy_analysis(self) -> Dict[str, Any]:
        """Entropy analysis (detect encryption/compression)."""
        import math
        with open(self.firmware_path, 'rb') as f:
            data = f.read()
        if not data:
            return {'error': 'File is empty'}

        # Count byte frequencies
        byte_count = {}
        total_bytes = len(data)
        for byte in data:
            byte_count[byte] = byte_count.get(byte, 0) + 1

        # Compute the Shannon entropy in bits per byte
        entropy = 0.0
        for count in byte_count.values():
            probability = count / total_bytes
            if probability > 0:
                entropy -= probability * math.log2(probability)

        # Analysis result
        analysis = {
            'entropy': entropy,
            'max_entropy': 8.0,  # entropy of fully random data
            'entropy_percentage': (entropy / 8.0) * 100,
            'interpretation': ''
        }

        # Interpret the entropy value
        if entropy < 1.0:
            analysis['interpretation'] = 'Low entropy - likely padding or highly repetitive data'
        elif entropy < 6.0:
            analysis['interpretation'] = 'Medium entropy - likely code or structured data'
        else:
            analysis['interpretation'] = 'High entropy - likely encrypted or heavily compressed data'

        # Build the data for the entropy distribution plot
        chunk_size = 1024
        entropy_points = []

for i in range(0, len(data), chunk_size):

chunk = data[i:i + chunk_size]

if len(chunk) < chunk_size:

break

chunk_entropy = 0.0

chunk_count = {}

for byte in chunk:

chunk_count[byte] = chunk_count.get(byte, 0) + 1

for count in chunk_count.values():

prob = count / chunk_size

if prob > 0:

chunk_entropy -= prob * math.log2(prob)

entropy_points.append({

'offset': i,

'entropy': chunk_entropy

})

analysis['entropy_distribution'] = entropy_points[:100] # 只取前100个点

return analysis

def _signature_scan(self) -> List[Dict[str, Any]]:

"""签名扫描"""

self.logger.info("执行签名扫描...")

        try:
            results = []
            # binwalk.scan() takes the CLI long options as keyword arguments
            # and returns one module object per scan type
            for module in binwalk.scan(str(self.firmware_path), signature=True, quiet=True):
                for result in module.results:
                    results.append({
                        'offset': result.offset,
                        'description': result.description,
                        'file_type': getattr(result, 'file_type', 'unknown'),
                        'compression': getattr(result, 'compression', False),
                        'encrypted': getattr(result, 'encrypted', False)
                    })
            return results

except Exception as e:

self.logger.error(f"签名扫描失败: {e}")

return []

def _strings_analysis(self) -> Dict[str, List[str]]:

"""字符串分析"""

import subprocess

self.logger.info("分析字符串...")

strings_result = {

'potential_credentials': [],

'urls': [],

'ip_addresses': [],

'file_paths': [],

'interesting_strings': []

}

# 提取字符串

        try:
            result = subprocess.run(
                ['strings', '-n', '4', str(self.firmware_path)],
                capture_output=True, text=True, timeout=60
            )
            all_strings = result.stdout.split('\n')
            import re

            # Search for passwords and credentials
            password_patterns = [
                r'password\s*[:=]\s*([^\s;]+)',
                r'passwd\s*[:=]\s*([^\s;]+)',
                r'admin\s*[:=]\s*([^\s;]+)',
                r'root\s*[:=]\s*([^\s;]+)',
                r'[A-Za-z0-9+/]{20,}={0,2}',  # possibly Base64-encoded credentials
            ]

for pattern in password_patterns:

for string in all_strings:

matches = re.findall(pattern, string, re.IGNORECASE)

strings_result['potential_credentials'].extend(matches)

# 搜索URL

url_pattern = r'https?://[^\s<>"\'{}|\\^`\[\]]+'

for string in all_strings:

urls = re.findall(url_pattern, string)

strings_result['urls'].extend(urls)

# 搜索IP地址

ip_pattern = r'\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b'

for string in all_strings:

ips = re.findall(ip_pattern, string)

strings_result['ip_addresses'].extend(ips)

# 搜索文件路径

path_patterns = [

r'/etc/[^\s<>"\'{}|\\^`\[\]]+',

r'/bin/[^\s<>"\'{}|\\^`\[\]]+',

r'/sbin/[^\s<>"\'{}|\\^`\[\]]+',

r'/usr/[^\s<>"\'{}|\\^`\[\]]+',

]

for pattern in path_patterns:

for string in all_strings:

paths = re.findall(pattern, string)

strings_result['file_paths'].extend(paths)

# 其他有趣字符串（关键字）

keywords = [

'backdoor', 'debug', 'test', 'secret',

'key', 'token', 'auth', 'login',

'kernel', 'init', 'startup',

'vulnerability', 'exploit',

'CVE-', '0day', 'hack'

]

for keyword in keywords:

for string in all_strings:

if keyword.lower() in string.lower():

strings_result['interesting_strings'].append(string)

break

# 去重

for key in strings_result:

strings_result[key] = list(set(strings_result[key]))[:50] # 每个类别最多50条

except subprocess.TimeoutExpired:

self.logger.warning("字符串提取超时")

except Exception as e:

self.logger.error(f"字符串分析错误: {e}")

return strings_result

def _extract_files(self) -> Dict[str, Any]:

"""提取文件"""

self.logger.info("提取文件系统...")

extract_dir = self.output_dir / 'extracted'

extract_dir.mkdir(exist_ok=True)

        try:
            # Extract with binwalk (keyword arguments mirror the CLI long options)
            binwalk.scan(
                str(self.firmware_path),
                signature=True,
                quiet=True,
                extract=True,
                matryoshka=True,  # recursive extraction
                directory=str(extract_dir),
            )

# 收集提取结果

extracted_files = []

for root, dirs, files in os.walk(extract_dir):

for file in files:

filepath = os.path.join(root, file)

rel_path = os.path.relpath(filepath, extract_dir)

                    try:
                        file_type = subprocess.run(
                            ['file', '-b', filepath],
                            capture_output=True, text=True
                        ).stdout.strip()
                        extracted_files.append({
                            'path': rel_path,
                            'size': os.path.getsize(filepath),
                            'type': file_type,
                            'md5': self._calculate_file_hash(filepath, 'md5')
                        })
                    except Exception:
                        continue
            return {
                'extract_dir': str(extract_dir),
                'file_count': len(extracted_files),
                'files': extracted_files[:100],  # only the first 100 files
                'total_size': sum(f['size'] for f in extracted_files)
            }
        except Exception as e:
            self.logger.error(f"File extraction failed: {e}")
            return {'error': str(e)}

    def _calculate_file_hash(self, filepath: str, algorithm: str) -> str:
        """Compute the hash of a single file."""
        hash_func = getattr(hashlib, algorithm)()
        try:
            with open(filepath, 'rb') as f:
                for chunk in iter(lambda: f.read(4096), b''):
                    hash_func.update(chunk)
            return hash_func.hexdigest()
        except OSError:
            return 'error'

    def generate_html_report(self, data=None):
        """Generate a detailed HTML report (reuses scan results when they are passed in)."""
        import html
        from datetime import datetime
        if data is None:
            data = self.comprehensive_scan()
        generated_at = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        html_content = f'''<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Firmware Analysis Report - {html.escape(self.firmware_path.name)}</title>
<style>
body {{ font-family: Arial, sans-serif; margin: 20px; line-height: 1.6; }}
.container {{ max-width: 1200px; margin: 0 auto; }}
.header {{ background: #f5f5f5; padding: 20px; border-radius: 5px; margin-bottom: 20px; }}
.section {{ margin-bottom: 30px; padding: 20px; border: 1px solid #ddd; border-radius: 5px; }}
.section-title {{ color: #333; border-bottom: 2px solid #007bff; padding-bottom: 10px; }}
.danger {{ color: #dc3545; font-weight: bold; }}
.warning {{ color: #ffc107; font-weight: bold; }}
.success {{ color: #28a745; font-weight: bold; }}
pre {{ background: #f8f9fa; padding: 10px; border-radius: 3px; overflow: auto; }}
table {{ width: 100%; border-collapse: collapse; margin: 10px 0; }}
th, td {{ border: 1px solid #ddd; padding: 8px; text-align: left; }}
th {{ background: #f2f2f2; }}
tr:nth-child(even) {{ background: #f9f9f9; }}
</style>
</head>
<body>
<div class="container">
<div class="header">
<h1>Firmware Security Analysis Report</h1>
<p><strong>File name:</strong> {html.escape(data['file_info'].get('path', 'N/A'))}</p>
<p><strong>Analysis time:</strong> {html.escape(generated_at)}</p>
</div>
<div class="section">
<h2 class="section-title">Basic File Information</h2>
<table>
<tr><th>Item</th><th>Value</th></tr>
<tr><td>File size</td><td>{data['file_info'].get('size', 'N/A')} bytes</td></tr>
<tr><td>MD5</td><td><code>{data['file_info'].get('md5', 'N/A')}</code></td></tr>
<tr><td>SHA-256</td><td><code>{data['file_info'].get('sha256', 'N/A')}</code></td></tr>
<tr><td>File type</td><td>{html.escape(data['file_info'].get('file_type', 'N/A'))}</td></tr>
</table>
</div>
<div class="section">
<h2 class="section-title">Entropy Analysis</h2>
<p><strong>Entropy:</strong> {data['entropy_analysis'].get('entropy', 0.0):.4f} / 8.0</p>
<p><strong>Percentage:</strong> {data['entropy_analysis'].get('entropy_percentage', 0.0):.1f}%</p>
<p><strong>Interpretation:</strong> {html.escape(data['entropy_analysis'].get('interpretation', 'N/A'))}</p>
<h3>Entropy Distribution</h3>
<script>
// Chart.js code for drawing the entropy distribution can be added here.
// It is omitted for brevity; a real implementation needs to load the Chart.js library.
</script>
</div>
<div class="section">
<h2 class="section-title">Signature Scan Results</h2>
<table>
<tr><th>Offset</th><th>Description</th><th>File type</th><th>Status</th></tr>
'''
        for sig in data['signature_scan'][:20]:  # only the first 20 entries
            if sig.get('encrypted'):
                status = '<span class="danger">Encrypted</span>'
            elif sig.get('compression'):
                status = '<span class="warning">Compressed</span>'
            else:
                status = '<span class="success">Normal</span>'
            html_content += f'''<tr>
<td>0x{sig.get('offset', 0):X}</td>
<td>{html.escape(sig.get('description', 'N/A'))}</td>
<td>{html.escape(sig.get('file_type', 'N/A'))}</td>
<td>{status}</td>
</tr>
'''
        html_content += '''</table>
</div>
<div class="section">
<h2 class="section-title">String Analysis</h2>
'''
        for category, items in data['strings_analysis'].items():
            if items:
                html_content += f'''<h3>{category.replace('_', ' ').title()}</h3>
<ul>
'''
                for item in items[:10]:  # only the first 10 entries per category
                    escaped_item = html.escape(str(item))
                    # Highlight potentially dangerous credentials
                    if category == 'potential_credentials':
                        html_content += f'<li class="danger">{escaped_item}</li>\n'
                    else:
                        html_content += f'<li>{escaped_item}</li>\n'
                html_content += '</ul>\n'
        html_content += '''</div>
<div class="section">
<h2 class="section-title">Extracted Files</h2>
<p><strong>Extraction directory:</strong> {}</p>
<p><strong>Total files:</strong> {}</p>
<p><strong>Total size:</strong> {:.2f} MB</p>
<h3>Partial File List</h3>
<table>
<tr><th>Path</th><th>Size</th><th>Type</th><th>MD5</th></tr>
'''.format(
            html.escape(data['extraction_results'].get('extract_dir', 'N/A')),
            data['extraction_results'].get('file_count', 0),
            data['extraction_results'].get('total_size', 0) / 1024 / 1024
        )
        for file_info in data['extraction_results'].get('files', [])[:15]:
            html_content += f'''<tr>
<td>{html.escape(file_info.get('path', 'N/A'))}</td>
<td>{file_info.get('size', 0)} bytes</td>
<td>{html.escape(file_info.get('type', 'N/A'))}</td>
<td><code>{file_info.get('md5', 'N/A')}</code></td>
</tr>
'''
        html_content += '''</table>
</div>
<div class="section">
<h2 class="section-title">Security Recommendations</h2>
<ul>
<li>Review every potential credential found and confirm whether hard-coded passwords exist</li>
<li>Analyze the extracted executables for vulnerabilities</li>
<li>Verify the firmware's signature and integrity-check mechanisms</li>
<li>Review network services and open-port configuration</li>
</ul>
</div>
<div class="section">
<h2 class="section-title">Disclaimer</h2>
<p>This report is intended for security research and education only. Make sure you have proper authorization before performing any security testing.</p>
</div>
</div>
</body>
</html>
'''
        report_path = self.output_dir / 'report.html'
        with open(report_path, 'w', encoding='utf-8') as f:
            f.write(html_content)
        self.logger.info(f"HTML report generated: {report_path}")
        return str(report_path)


# Usage example
if __name__ == "__main__":
    analyzer = AdvancedFirmwareAnalyzer("firmware.bin")
    # Run the full analysis
    results = analyzer.comprehensive_scan()
    # Generate the HTML report from the same results
    html_report = analyzer.generate_html_report(results)
    print("\nAnalysis complete!")
    print(f"- JSON report: {analyzer.output_dir}/comprehensive_report.json")
    print(f"- HTML report: {html_report}")
    print(f"- Log file: {analyzer.output_dir}/analysis.log")

5. Sensitive Information Mining: Automated Search and Analysis

5.1 An Automated Sensitive Information Search System

#!/usr/bin/env python3
"""
Automated sensitive information search system for firmware.
Supports detection of several types of sensitive information.
"""
import os
import re
import json
import hashlib
import logging
from pathlib import Path
from typing import Dict, List, Set, Tuple, Optional
import mimetypes
import subprocess


class SensitiveInfoHunter:
    """Sensitive information hunter."""

    # Predefined regular expression patterns
    PATTERNS = {
        # Hard-coded credentials
        'hardcoded_passwords': [
            r'(?i)(?:password|passwd|pwd)[=:\s]+[\'"]?([^\'"\s]{4,})',
            r'(?i)admin\s*[=:]\s*[\'"]?([^\'"\s]{3,})',
            r'(?i)root\s*[=:]\s*[\'"]?([^\'"\s]{3,})',
            r'(?i)(?:user|username)\s*[=:]\s*[\'"]?([^\'"\s]{3,})',
        ],
        # API keys and tokens
        'api_keys': [
            # AWS keys
            r'(?:AKIA|ASIA)[A-Z0-9]{16}',
            # Google API keys
            r'AIza[0-9A-Za-z\-_]{35}',
            # GitHub tokens
            r'ghp_[a-zA-Z0-9]{36}',
            # JWT tokens
            r'eyJ[a-zA-Z0-9]{10,}\.[a-zA-Z0-9]{10,}\.[a-zA-Z0-9_\-]{10,}',
        ],
        # Cryptographic keys
        'crypto_keys': [
            # RSA private keys
            r'-----BEGIN (?:RSA|EC|DSA) PRIVATE KEY-----',
            # SSH private keys
            r'-----BEGIN OPENSSH PRIVATE KEY-----',
            # PGP private keys
            r'-----BEGIN PGP PRIVATE KEY BLOCK-----',
            # Generic private keys
            r'-----BEGIN PRIVATE KEY-----',
        ],
        # Database connection strings
        'database_connections': [
            # MySQL
            r'mysql://[^:\s]+:[^@\s]+@[^\s/]+/[^\s?]+',
            # PostgreSQL
            r'postgres(?:ql)?://[^:\s]+:[^@\s]+@[^\s/]+/[^\s?]+',
            # MongoDB
            r'mongodb(?:\+srv)?://[^:\s]+:[^@\s]+@[^\s/]+',
        ],
        # Secrets in configuration files
        'config_secrets': [
            # Generic configuration format
            r'(?i)(?:secret|key|token|credential)[\s=:]+[\'"]?([^\'"\s]{8,})',
            # JSON format
            r'["\']?(?:secret|key|token)["\']?\s*:\s*["\']([^\'"]{8,})["\']',
            # XML format
            r'<(?:secret|key|token)[^>]*>([^<]{8,})</(?:secret|key|token)>',
        ],
        # Debug information and backdoors
        'debug_backdoors': [
            # Debug switches
            r'debug\s*[=:]\s*[\'"]?(?:true|yes|on|1)[\'"]?',
            # Backdoor commands
            r'(?i)(?:backdoor|debug|test)_?(?:mode|shell|cmd)',
            # Hard-coded debug credentials
            r'(?:debug|test)[=:]\s*[\'"]?(?:admin|root)[\'"]?',
        ],
    }

    # Sensitive file extensions
    SENSITIVE_EXTENSIONS = {
        '.pem', '.key', '.crt', '.pfx', '.p12',        # certificates and keys
        '.db', '.sqlite', '.sqlite3', '.db3',          # database files
        '.env', '.config', '.cfg', '.conf',            # configuration files
        '.sh', '.bash', '.zsh', '.fish',               # shell scripts
        '.py', '.php', '.js', '.java', '.c',           # source files
        '.xml', '.json', '.yml', '.yaml',              # configuration files
        '.log', '.txt', '.md',                         # text files
        '.history', '.bash_history', '.zsh_history',   # history files
    }

    # Sensitive path keywords
    SENSITIVE_PATHS = {
        '/etc/passwd', '/etc/shadow', '/etc/sudoers',
        '/root/', '/home/', '/var/log/', '/tmp/',
        '.ssh/', '.aws/', '.config/', '.local/',
        'wp-config.php', 'config.php', 'settings.py',
    }

    def __init__(self, search_path: str, output_dir: str = "./reports"):
        self.search_path = Path(search_path)
        self.output_dir = Path(output_dir)
        self.output_dir.mkdir(exist_ok=True)

        # Result storage
        self.findings = {
            category: [] for category in self.PATTERNS.keys()
        }
        self.findings['sensitive_files'] = []
self.findings['statistics'] = {}

# 设置日志

self._setup_logging()

def _setup_logging(self):

"""设置日志系统"""

log_file = self.output_dir / 'hunter.log'

logging.basicConfig(

level=logging.INFO,

format='%(asctime)s - %(levelname)s - %(message)s',

handlers=[

logging.FileHandler(log_file),

logging.StreamHandler()

]

)

        self.logger = logging.getLogger(__name__)

def scan(self) -> Dict:

"""执行完整扫描"""

self.logger.info(f"开始扫描: {self.search_path}")

# 1. 扫描文件

self._scan_files()

# 2. 扫描二进制文件中的字符串

self._scan_binaries()

# 3. 分析配置文件

self._analyze_configs()

# 4. 生成报告

report = self._generate_report()

self.logger.info(f"扫描完成，发现 {self._count_findings()} 个敏感信息")

return report

def _scan_files(self):

"""扫描文件系统中的文件"""

self.logger.info("扫描文件系统...")

for root, dirs, files in os.walk(self.search_path):

# 跳过一些目录

dirs[:] = [d for d in dirs if not d.startswith('.') and d not in {'tmp', 'temp'}]

for file in files:

filepath = Path(root) / file

rel_path = str(filepath.relative_to(self.search_path))

# 检查是否敏感文件

if self._is_sensitive_file(filepath):

self.findings['sensitive_files'].append({

'path': rel_path,

'reason': '敏感扩展名',

'size': filepath.stat().st_size

})

# 检查路径是否敏感

for sensitive_path in self.SENSITIVE_PATHS:

if sensitive_path in rel_path:

self.findings['sensitive_files'].append({

'path': rel_path,

'reason': f'敏感路径包含: {sensitive_path}',

'size': filepath.stat().st_size

})

break

# 扫描文件内容

self._scan_file_content(filepath, rel_path)

def _is_sensitive_file(self, filepath: Path) -> bool:

"""判断是否为敏感文件"""

# 检查扩展名

if filepath.suffix.lower() in self.SENSITIVE_EXTENSIONS:

return True

# 检查文件名

sensitive_names = {

'passwd', 'shadow', 'sudoers', 'hosts',

'authorized_keys', 'id_rsa', 'id_dsa',

'credentials', 'config', 'settings',

'.env', '.gitconfig', '.npmrc',

}

if filepath.name in sensitive_names or filepath.name.startswith('.'):

return True

return False

def _scan_file_content(self, filepath: Path, rel_path: str):

"""扫描文件内容"""

try:

# 检查文件类型

mime_type = mimetypes.guess_type(str(filepath))[0]

# 只处理文本文件和常见的配置文件

if mime_type and ('text/' in mime_type or

mime_type in {'application/json',

'application/xml',

'application/yaml'}):

content = filepath.read_text(encoding='utf-8', errors='ignore')

self._analyze_text_content(content, rel_path, 'text')

# 处理二进制文件（使用strings）

elif filepath.stat().st_size < 10 * 1024 * 1024: # 小于10MB

try:

                    result = subprocess.run(
                        ['strings', str(filepath)],
                        capture_output=True, text=True, timeout=10
                    )
                    if result.returncode == 0:
                        self._analyze_text_content(result.stdout, rel_path, 'binary')
                except Exception:
                    pass
        except Exception as e:
            self.logger.debug(f"Failed to scan file {rel_path}: {e}")

    def _scan_binaries(self):
        """Scan binary files specifically."""
        self.logger.info("Deep-scanning binary files...")
        binary_extensions = {'.bin', '.elf', '.so', '.dll', '.exe'}
        for root, dirs, files in os.walk(self.search_path):
            for file in files:
                if Path(file).suffix.lower() in binary_extensions:
                    filepath = Path(root) / file
                    self._deep_scan_binary(filepath)

    def _deep_scan_binary(self, filepath: Path):
        """Deep-scan a binary file."""
        try:
            # Use stricter strings options: minimum length 6, hexadecimal offsets
            result = subprocess.run(
                ['strings', '-n', '6', '-t', 'x', str(filepath)],
                capture_output=True, text=True, timeout=30
            )
            if result.returncode == 0:
                content = result.stdout
                rel_path = str(filepath.relative_to(self.search_path))
                # Analyze the strings found in the binary
                self._analyze_text_content(content, rel_path, 'binary_deep')
                # Additional binary analysis
                self._analyze_binary_patterns(filepath, rel_path)
        except subprocess.TimeoutExpired:
            self.logger.warning(f"Binary scan timed out: {filepath}")
        except Exception as e:
            self.logger.debug(f"Binary scan failed {filepath}: {e}")

    def _analyze_binary_patterns(self, filepath: Path, rel_path: str):
        """Analyze specific patterns in a binary file."""
        try:
            # Read the file header
            with open(filepath, 'rb') as f:
                header = f.read(512)

            # Check for common vulnerability patterns
            # 1. Hard-coded IP addresses
            ip_pattern = rb'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}'
            ips = re.findall(ip_pattern, header)
            for ip in ips[:5]:  # at most 5
                self.findings['debug_backdoors'].append({
                    'type': 'Hard-coded IP',
                    'value': ip.decode('ascii', errors='ignore'),
                    'source': rel_path,
                    'context': 'file header'
                })

            # 2. Check for debug strings
            debug_strings = [b'debug', b'test', b'backdoor', b'password']
for debug_str in debug_strings:

if debug_str in header.lower():

self.findings['debug_backdoors'].append({

'type': '调试字符串',

'value': debug_str.decode('ascii'),

'source': rel_path,

'context': '文件头部'

})

except Exception as e:

self.logger.debug(f"二进制模式分析失败: {e}")

    def _analyze_configs(self):
        """Analyze configuration files."""
        config_files = []
        for finding in self.findings['sensitive_files']:
            if 'config' in finding['path'].lower() or finding['path'].endswith('.conf'):
                config_files.append(finding['path'])
        self.logger.info(f"Found {len(config_files)} configuration files; analyzing in depth...")
        for config_path in config_files[:20]:  # analyze at most 20
            try:
                full_path = self.search_path / config_path
                content = full_path.read_text(encoding='utf-8', errors='ignore')
                # Dispatch on the configuration file format
                if config_path.endswith('.json'):
                    self._analyze_json_config(content, config_path)
                elif config_path.endswith('.yml') or config_path.endswith('.yaml'):
                    self._analyze_yaml_config(content, config_path)
                elif config_path.endswith('.xml'):
                    self._analyze_xml_config(content, config_path)
            except Exception as e:
                self.logger.debug(f"Configuration file analysis failed {config_path}: {e}")

    def _analyze_json_config(self, content: str, source: str):
        """Analyze a JSON configuration file."""
        try:
            import json as json_module
            config = json_module.loads(content)
            # Recursively search for sensitive keys and values
            self._search_json_recursive(config, source, [])
        except Exception:
            # Not valid JSON: fall back to the regex-based text scan
            self._analyze_text_content(content, source, 'config')

    def _search_json_recursive(self, data, source: str, path: List[str]):
        """Recursively search parsed JSON (or YAML) data for sensitive information."""
        if isinstance(data, dict):
            for key, value in data.items():
                # str() guards against non-string keys, which YAML allows
                current_path = path + [str(key)]
                # Check the dotted key path
                key_str = '.'.join(current_path)
                if any(sensitive in key_str.lower()
                       for sensitive in ['pass', 'secret', 'key', 'token']):
                    if value and isinstance(value, str) and len(value) > 4:
                        self.findings['config_secrets'].append({
                            'type': 'JSON config',
                            'key': key_str,
                            'value': value[:50] + ('...' if len(value) > 50 else ''),
                            'source': source
                        })
                # Recurse into nested containers
                if isinstance(value, (dict, list)):
                    self._search_json_recursive(value, source, current_path)
        elif isinstance(data, list):
            for i, item in enumerate(data):
                self._search_json_recursive(item, source, path + [str(i)])

    def _analyze_yaml_config(self, content: str, source: str):
        """Analyze a YAML configuration file."""
        try:
            import yaml
            config = yaml.safe_load(content)
            self._search_json_recursive(config, source, [])
        except Exception:
            # PyYAML missing or invalid YAML: fall back to the regex-based text scan
            self._analyze_text_content(content, source, 'config')

    def _analyze_xml_config(self, content: str, source: str):
        """Analyze an XML configuration file."""
        # Search for sensitive values in XML with regular expressions
        patterns = [
            r'<(?:password|passwd|secret|key)[^>]*>([^<]+)</(?:password|passwd|secret|key)>',
            r'(?:password|secret|key)="([^"]+)"',
        ]
        for pattern in patterns:
            matches = re.findall(pattern, content, re.IGNORECASE)
            for match in matches:
                if len(match) > 4:
                    self.findings['config_secrets'].append({
                        'type': 'XML config',
                        'value': match[:50] + ('...' if len(match) > 50 else ''),
                        'source': source
                    })

    def _analyze_text_content(self, content: str, source: str, file_type: str):
        """Run every pattern in PATTERNS against a block of text."""
        for category, patterns in self.PATTERNS.items():
            for pattern in patterns:
                try:
                    matches = re.findall(pattern, content, re.MULTILINE | re.IGNORECASE)
                    for match in matches:
                        # re.findall returns tuples when the pattern has several groups
                        if isinstance(match, tuple):
                            # Prefer the first group; otherwise join the remaining groups
                            value = match[0] if match[0] else ''.join(match[1:])
                        else:
                            value = match
                        # Skip empty or very short values
                        if not value or len(str(value).strip()) < 4:
                            continue
                        # Attach surrounding lines as context
                        context = self._get_context(content, str(value))
                        self.findings[category].append({
                            'value': str(value)[:100],  # limit length
                            'source': source,
                            'type': file_type,
                            'context': context,
                            'pattern': pattern[:50]
                        })
                except Exception as e:
                    self.logger.debug(f"Pattern matching failed {pattern}: {e}")

    def _get_context(self, content: str, target: str, context_lines: int = 2) -> str:
        """Return the lines surrounding the first occurrence of target."""
        try:
            lines = content.split('\n')
            for i, line in enumerate(lines):
                if target in line:
                    start = max(0, i - context_lines)
                    end = min(len(lines), i + context_lines + 1)
                    context = '\n'.join(lines[start:end])
                    return context[:500]  # limit length
        except Exception:
            pass
        return target[:100]

    def _count_findings(self) -> int:
        """Count all findings across categories."""
        total = 0
        for category, items in self.findings.items():
            if category != 'statistics':
                total += len(items)
        return total

    def _generate_report(self) -> Dict:
        """Generate the detailed report."""
        self.logger.info("Generating report...")
        # Statistics
        stats = {
            # Rough placeholder (sensitive-file count plus 100), not a measured total
            'total_files_scanned': len(self.findings['sensitive_files']) + 100,
            'total_findings': self._count_findings(),
            'by_category': {},
            'risk_assessment': {}
        }
        # Keep a scan time recorded earlier; replacing 'statistics' below would drop it
        previous = self.findings.get('statistics')
        if isinstance(previous, dict) and 'scan_time' in previous:
            stats['scan_time'] = previous['scan_time']
        for category, items in self.findings.items():
            if category != 'statistics':
                stats['by_category'][category] = len(items)
        # Risk assessment: weighted sum of the most serious categories
        risk_score = 0
        risk_score += 10 * stats['by_category'].get('hardcoded_passwords', 0)
        risk_score += 8 * stats['by_category'].get('crypto_keys', 0)
        risk_score += 7 * stats['by_category'].get('debug_backdoors', 0)
        if risk_score > 50:
            risk_level = "Critical"
        elif risk_score > 20:
            risk_level = "High"
        elif risk_score > 10:
            risk_level = "Medium"
        else:
            risk_level = "Low"
        stats['risk_assessment'] = {
            'score': risk_score,
            'level': risk_level,
            'recommendation': self._get_recommendation(risk_level)
        }
        self.findings['statistics'] = stats
        # Save the reports
        self._save_reports()
        return self.findings

    def _get_recommendation(self, risk_level: str) -> str:
        """Return a recommendation for the given risk level."""
        recommendations = {
            "Critical": "Fix immediately! Hard-coded passwords and cryptographic keys were found; the system is seriously exposed.",
            "High": "Fix as soon as possible! Several sensitive-information leaks were found.",
            "Medium": "Fixing is recommended; some security risks exist.",
            "Low": "Risk is low, but a code review is still recommended."
        }
        return recommendations.get(risk_level, "Please perform a detailed security assessment.")

    def _save_reports(self):
        """Save the report in several formats."""
        # 1. JSON report
        json_path = self.output_dir / 'sensitive_info_report.json'
        with open(json_path, 'w', encoding='utf-8') as f:
            json.dump(self.findings, f, indent=2, ensure_ascii=False, default=str)
        # 2. Markdown report
        md_path = self.output_dir / 'report.md'
        self._generate_markdown_report(md_path)
        # 3. CSV report (easy to import into other tools)
        csv_path = self.output_dir / 'findings.csv'
        self._generate_csv_report(csv_path)
        self.logger.info(f"Reports saved to: {self.output_dir}")

    def _generate_markdown_report(self, path: Path):
        """Generate the Markdown report."""
        md_content = f"""# Firmware Sensitive Information Scan Report

## Overview

- Scan path: `{self.search_path}`
- Scan time: {self.findings['statistics'].get('scan_time', 'N/A')}
- Risk level: {self.findings['statistics']['risk_assessment']['level']}
- Risk score: {self.findings['statistics']['risk_assessment']['score']}

## Statistics

"""
        # Statistics table
        md_content += "| Category | Findings | Risk level |\n"
        md_content += "|------|----------|----------|\n"
        for category, count in self.findings['statistics']['by_category'].items():
            if count > 0:
                # Determine the risk level of the category
                if category in ['hardcoded_passwords', 'crypto_keys']:
                    risk = "🔥 High"
                elif category in ['api_keys', 'debug_backdoors']:
                    risk = "⚠️ Medium"
                else:
                    risk = "ℹ️ Low"
                md_content += f"| {category} | {count} | {risk} |\n"
        # Detailed findings
        for category, items in self.findings.items():
            if category != 'statistics' and items:
                md_content += f"\n## {category.replace('_', ' ').title()}\n\n"
                for i, item in enumerate(items[:20], 1):  # show at most 20 per category
                    md_content += f"### Finding #{i}\n"
                    md_content += f"- Value: `{item.get('value', 'N/A')}`\n"
                    md_content += f"- Source: `{item.get('source', 'N/A')}`\n"
                    md_content += f"- Type: {item.get('type', 'N/A')}\n"
                    if item.get('context'):
                        md_content += f"- Context:\n```\n{item['context']}\n```\n"
                    md_content += "\n"
        # Recommendations
        md_content += f"""## Security Recommendations

{self.findings['statistics']['risk_assessment']['recommendation']}

### Specific recommendations:

Immediate actions:

- Remove all hard-coded passwords and keys
- Rotate all leaked API keys and tokens
- Disable or remove debug backdoors

Short-term improvements:

- Adopt a secure key management scheme
- Encrypt configuration files
- Add access control and audit logging

Long-term planning:

- Establish a secure development lifecycle (SDL)
- Run regular security code reviews
- Introduce automated security testing

## Disclaimer

This report is intended for security research and educational purposes only. Make sure you have proper authorization before performing any security testing.

Report generated at: {self.findings['statistics'].get('scan_time', 'N/A')}
"""
        with open(path, 'w', encoding='utf-8') as f:
            f.write(md_content)

    def _generate_csv_report(self, path: Path):
        """Generate the CSV report."""
        import csv
        with open(path, 'w', newline='', encoding='utf-8') as f:
            writer = csv.writer(f)
            # Header row
            writer.writerow(['Category', 'Value', 'Source', 'Type', 'Context', 'Risk level'])
            # Data rows
            for category, items in self.findings.items():
                if category != 'statistics':
                    for item in items:
                        # Determine the risk level of the category
                        if category in ['hardcoded_passwords', 'crypto_keys']:
                            risk = 'HIGH'
                        elif category in ['api_keys', 'debug_backdoors']:
                            risk = 'MEDIUM'
                        else:
                            risk = 'LOW'
                        writer.writerow([
                            category,
                            item.get('value', ''),
                            item.get('source', ''),
                            item.get('type', ''),
                            item.get('context', '').replace('\n', '\\n'),
                            risk
                        ])

# Usage example
if __name__ == "__main__":
    import sys

    if len(sys.argv) < 2:
        print("Usage: python sensitive_info_hunter.py <firmware path> [output dir]")
        sys.exit(1)
    firmware_path = sys.argv[1]
    output_dir = sys.argv[2] if len(sys.argv) > 2 else "./reports"
    if not os.path.exists(firmware_path):
        print(f"Error: path does not exist: {firmware_path}")
        sys.exit(1)
    print(f"Starting scan: {firmware_path}")
    print(f"Output directory: {output_dir}")
    print("-" * 50)
    hunter = SensitiveInfoHunter(firmware_path, output_dir)
    report = hunter.scan()
    print("\nScan complete!")
    print(f"Total sensitive findings: {report['statistics']['total_findings']}")
    print(f"Risk level: {report['statistics']['risk_assessment']['level']}")
    print("Report files:")
    print(f" - JSON: {output_dir}/sensitive_info_report.json")
    print(f" - Markdown: {output_dir}/report.md")
    print(f" - CSV: {output_dir}/findings.csv")
    print(f" - Log: {output_dir}/hunter.log")
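The recursive key search is the least obvious part of the scanner, so here is a minimal standalone sketch of the same idea for experimenting outside the class. The names `find_sensitive_keys`, `SENSITIVE_MARKERS`, and the sample configuration are hypothetical and exist only for illustration; the matching rule (substring match on the dotted key path, string values longer than 4 characters) mirrors the listing above.

```python
from typing import Any, List, Tuple

# Substrings that mark a key path as potentially sensitive (same list as the scanner)
SENSITIVE_MARKERS = ('pass', 'secret', 'key', 'token')


def find_sensitive_keys(data: Any, path: Tuple[str, ...] = ()) -> List[Tuple[str, str]]:
    """Return (dotted key path, value) pairs whose key path looks sensitive."""
    hits: List[Tuple[str, str]] = []
    if isinstance(data, dict):
        for key, value in data.items():
            current = path + (str(key),)
            dotted = '.'.join(current)
            # Match against the whole dotted path, not just the last key
            if (any(marker in dotted.lower() for marker in SENSITIVE_MARKERS)
                    and isinstance(value, str) and len(value) > 4):
                hits.append((dotted, value))
            # Recurse into nested containers
            if isinstance(value, (dict, list)):
                hits.extend(find_sensitive_keys(value, current))
    elif isinstance(data, list):
        for i, item in enumerate(data):
            hits.extend(find_sensitive_keys(item, path + (str(i),)))
    return hits


# Hypothetical sample configuration, for illustration only
sample = {
    "wifi": {"ssid": "lab", "password": "hunter2-demo"},
    "servers": [{"host": "10.0.0.1", "api_token": "abcdef123456"}],
    "debug": True,
}

for dotted, value in find_sensitive_keys(sample):
    print(dotted, "=>", value)
```

Because the match runs against the full dotted path, every string value nested under a parent such as `keystore` is reported too. That keeps recall high at the cost of extra false positives, which is usually an acceptable trade-off for a first-pass triage tool.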

5.2 Quick Firmware Secrets Scan Script

#!/bin/bash
# firmware_secrets_scan.sh - quickly scan a firmware image for sensitive information
set -e

# Color definitions
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Logging helpers
log_info() { echo -e "${BLUE}[INFO]${NC} $1"; }
log_success() { echo -e "${GREEN}[SUCCESS]${NC} $1"; }
log_warning() { echo -e "${YELLOW}[WARNING]${NC} $1"; }
log_error() { echo -e "${RED}[ERROR]${NC} $1"; }

# Check dependencies
check_dependencies() {
    local deps=("binwalk" "strings" "grep" "find" "file")
    local missing=()
    for dep in "${deps[@]}"; do
        if ! command -v "$dep" &> /dev/null; then
            missing+=("$dep")
        fi
    done
    if [ ${#missing[@]} -ne 0 ]; then
        log_error "Missing dependencies: ${missing[*]}"
        exit 1
    fi
}

# Quick scan function
quick_scan() {
    local firmware="$1"
    local output_dir="${firmware%.*}_secrets_$(date +%Y%m%d_%H%M%S)"
    log_info "Starting quick scan: $(basename "$firmware")"
    mkdir -p "$output_dir"

    # 1. Basic information
    log_info "Collecting basic information..."
    {
        echo "=== Firmware basic information ==="
        echo "File name: $(basename "$firmware")"
        echo "Size: $(du -h "$firmware" | cut -f1)"
        echo "MD5: $(md5sum "$firmware" | cut -d' ' -f1)"
        echo "SHA256: $(sha256sum "$firmware" | cut -d' ' -f1)"
        echo "File type: $(file -b "$firmware")"
        echo ""
    } > "$output_dir/basic_info.txt"

    # 2. Direct string search
    # Section titles avoid the words grepped for later in the summary step
    log_info "Searching for hard-coded strings..."
    {
        echo "=== Hard-coded string search ==="
        echo ""
        echo "1. Credential-related strings:"
        strings "$firmware" | grep -iE "password|passwd|pwd|admin|root|login" | head -20
        echo ""
        echo "2. URLs and IP addresses:"
        strings "$firmware" | grep -E "https?://|ftp://|[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" | head -20
        echo ""
        echo "3. Debug and test strings:"
        strings "$firmware" | grep -iE "debug|test|backdoor|shell|console" | head -20
        echo ""
        echo "4. Version information:"
        strings "$firmware" | grep -iE "version|v[0-9]\.[0-9]|build|release" | head -20
        echo ""
        echo "5. Possible keys (Base64-like runs):"
        strings "$firmware" | grep -E "[A-Za-z0-9+/]{20,}={0,2}" | head -20
    } > "$output_dir/strings_analysis.txt"

    # 3. Quick binwalk analysis
    log_info "Running binwalk analysis..."
    {
        echo "=== Binwalk quick analysis ==="
        echo ""
        # No -q here: quiet mode suppresses the signature table we want to capture
        binwalk "$firmware" | head -30
    } > "$output_dir/binwalk_analysis.txt"

    # 4. Extract files and search for configuration files
    log_info "Extracting the filesystem and searching for configuration files..."
    extract_dir="$output_dir/extracted"
    mkdir -p "$extract_dir"
    if binwalk -eq -C "$extract_dir" "$firmware" &> /dev/null; then
        {
            echo "=== Configuration file search ==="
            echo ""
            echo "1. Configuration files:"
            find "$extract_dir" -type f \( -name "*.conf" -o -name "*.cfg" -o -name "*.config" \) | head -20
            echo ""
            echo "2. Files containing passwords:"
            find "$extract_dir" -type f -exec grep -l -i "password" {} \; | head -20
            echo ""
            echo "3. Certificate and key files:"
            find "$extract_dir" -type f \( -name "*.pem" -o -name "*.key" -o -name "*.crt" \) | head -20
            echo ""
            echo "4. Shell scripts:"
            find "$extract_dir" -type f -name "*.sh" | head -20
        } > "$output_dir/config_files.txt"

        # 5. Check for known sensitive files
        log_info "Checking for known sensitive files..."
        {
            echo "=== Known sensitive file check ==="
            echo ""
            sensitive_files=(
                "/etc/passwd"
                "/etc/shadow"
                "/etc/sudoers"
                "/root/.bash_history"
                "/root/.ssh/id_rsa"
                ".env"
                "config.php"
                "wp-config.php"
                "settings.py"
            )
            for sensitive in "${sensitive_files[@]}"; do
                # Match any extracted path that ends with the sensitive name
                if find "$extract_dir" -path "*$sensitive" -type f | grep -q .; then
                    echo "⚠️ Found: $sensitive"
                    find "$extract_dir" -path "*$sensitive" -type f
                    echo ""
                fi
            done
        } > "$output_dir/sensitive_files.txt"
    else
        log_warning "File extraction failed; skipping filesystem analysis"
    fi

    # 6. Generate the summary report
    log_info "Generating summary report..."
    {
        echo "# Firmware Sensitive Information Quick Scan Report"
        echo "## Scan Information"
        echo "- Firmware file: $(basename "$firmware")"
        echo "- Scan time: $(date)"
        echo "- Scan mode: quick scan"
        echo ""
        echo "## Key Findings"
        echo ""
        # Running count of sensitive findings
        sensitive_count=0
        # Check the string analysis for credentials
        if grep -qi "password\|admin\|root" "$output_dir/strings_analysis.txt"; then
            echo "🔴 Hard-coded credentials found"
            grep -i "password\|admin\|root" "$output_dir/strings_analysis.txt" | head -5 | while IFS= read -r line; do
                echo " - $line"
            done
            # Plain assignment: ((sensitive_count++)) returns non-zero at 0 and would trip set -e
            sensitive_count=$((sensitive_count + 1))
            echo ""
        fi
        # Check configuration files
        if [ -f "$output_dir/config_files.txt" ] && [ -s "$output_dir/config_files.txt" ]; then
            # grep -c already prints 0 when nothing matches; || true only absorbs its exit status
            config_count=$(grep -c "\.conf\|\.cfg\|\.config" "$output_dir/config_files.txt" 2>/dev/null || true)
            if [ "${config_count:-0}" -gt 0 ]; then
                echo "🟡 Configuration files found ($config_count)"
                grep "\.conf\|\.cfg\|\.config" "$output_dir/config_files.txt" | head -3 | while IFS= read -r line; do
                    echo " - $line"
                done
                echo ""
            fi
        fi
        # Check sensitive files (entries are marked with the warning emoji)
        if [ -f "$output_dir/sensitive_files.txt" ] && grep -q "⚠️" "$output_dir/sensitive_files.txt"; then
            echo "🔴 Sensitive files found"
            grep "⚠️" "$output_dir/sensitive_files.txt" | while IFS= read -r line; do
                echo " - $line"
            done
            sensitive_count=$((sensitive_count + 2))
            echo ""
        fi
        echo "## Risk Assessment"
        if [ "$sensitive_count" -ge 3 ]; then
            echo "Risk level: 🔴 High"
            echo "Multiple sensitive findings; a detailed analysis is recommended immediately."
        elif [ "$sensitive_count" -ge 1 ]; then
            echo "Risk level: 🟡 Medium"
            echo "Some sensitive information was found; further analysis is recommended."
        else
            echo "Risk level: 🟢 Low"
            echo "No obvious sensitive information was found, but a full analysis is still recommended."
        fi
        echo ""
        echo "## Suggested Next Steps"
        echo "1. Review the detailed report files"
        echo "2. Run the full analysis tool for a deep scan"
        echo "3. Focus on the sensitive files that were found"
        echo "4. Verify the firmware's integrity and signature"
        echo ""
        echo "## Generated Files"
        echo "- basic_info.txt - basic information"
        echo "- strings_analysis.txt - string analysis"
        echo "- binwalk_analysis.txt - Binwalk analysis"
        if [ -f "$output_dir/config_files.txt" ]; then
            echo "- config_files.txt - configuration file list"
        fi
        if [ -f "$output_dir/sensitive_files.txt" ]; then
            echo "- sensitive_files.txt - sensitive file check"
        fi
    } > "$output_dir/SUMMARY.md"

    log_success "Quick scan complete!"
    echo ""
    echo "Output directory: $output_dir"
    echo "Summary report: $output_dir/SUMMARY.md"
    echo ""
    # Show a short summary
    if [ -f "$output_dir/SUMMARY.md" ]; then
        echo "=== Scan summary ==="
        grep -A5 "## Key Findings" "$output_dir/SUMMARY.md" | tail -10
        echo ""
        grep -A2 "## Risk Assessment" "$output_dir/SUMMARY.md" | tail -3
    fi
}

# Main function
main() {
    check_dependencies
    if [ $# -lt 1 ]; then
        log_error "Usage: $0 <firmware file>"
        log_info "Example: $0 firmware.bin"
        exit 1
    fi
    firmware="$1"
    if [ ! -f "$firmware" ]; then
        log_error "File does not exist: $firmware"
        exit 1
    fi
    quick_scan "$firmware"
}

main "$@"
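The quick scan's risk rating comes from a very small scoring rule that is easy to miss inside the shell script: credential strings add 1, known sensitive files add 2, and configuration files are listed but not scored. The sketch below restates that rule in Python so the thresholds can be checked in isolation. The function name `quick_scan_risk` and its boolean parameters are hypothetical, introduced only for this illustration.

```python
def quick_scan_risk(found_credentials: bool, found_sensitive_files: bool) -> str:
    """Mirror the shell script's sensitive_count thresholds."""
    score = 0
    if found_credentials:
        score += 1  # credential-like strings in the raw image
    if found_sensitive_files:
        score += 2  # e.g. /etc/shadow in the extracted filesystem
    if score >= 3:
        return "high"
    if score >= 1:
        return "medium"
    return "low"


# Print the rating for every combination of the two signals
for creds in (False, True):
    for files in (False, True):
        print(creds, files, quick_scan_risk(creds, files))
```

Under this rule a firmware image is rated high only when both signals are present; sensitive files alone (score 2) still map to medium. The thresholds may be worth adjusting if that is stricter or looser than intended for a given assessment.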
