SIGSEGV (11): Segmentation Faults Explained

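I. Signal Basics

Key facts

Signal number: 11
Signal name: SIGSEGV (Segmentation Violation)
POSIX standard: yes (specified since POSIX.1-1990; also part of ISO C)
Catchable: yes
Default action: terminate the process and produce a core dump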

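What it is for

SIGSEGV reports an invalid memory access. The kernel delivers it when a process touches an address that is not mapped into its address space, or accesses a mapped region in a way its protection forbids (for example, writing to a read-only page).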

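Default behavior

What the Linux kernel does by default:

Terminate the process: the faulting process is killed immediately
Produce a core dump: if the system configuration allows it (for example, a non-zero core size limit, ulimit -c), a core file is written for later debugging
Catchable, but not safely ignorable: a handler can be installed, yet POSIX leaves the behavior undefined if a process ignores a SIGSEGV generated by a real memory fault, or returns normally from its handler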
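
The default action can be observed from a parent process. The sketch below is illustrative and assumes a POSIX system; signalFromNullWrite is a hypothetical helper name, not something from the original text. It forks a child, lets the child write through a null pointer, and reports which signal ended the child.

```cpp
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: performs a null-pointer write in a child process and
// returns the number of the signal that terminated the child, or -1.
int signalFromNullWrite() {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;  // fork failed
    }
    if (pid == 0) {
        // Child: the pointer variable is volatile, so the compiler must load
        // it at run time and cannot reason about the null value.
        int* volatile target = nullptr;
        *target = 42;  // invalid write
        _exit(0);      // not reached if the write faults
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

On Linux, signalFromNullWrite() is expected to return SIGSEGV (11). Running the faulting code in a child keeps the calling process alive, which is also a convenient pattern for tests.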

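Relevance to C++

SIGSEGV is the most common crash signal in C++ development. Typical hot spots:

STL containers: out-of-bounds access on std::vector, std::string, and similar containers
Pointer operations: null pointer dereference, wild pointers, dangling pointers
Memory shared between threads: data races and unsynchronized access
Smart pointer misuse: objects released too early with std::shared_ptr or std::unique_ptr; reference cycles leak memory rather than crash directly
Memory management: use after free, double free, and other forms of heap corruption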

II. What Triggers the Signal

Root causes

1. Programming errors

Scenario 1.1: Null pointer dereference

// Buggy code
#include <iostream>

void processData(int* data) {
    *data = 100;  // triggers SIGSEGV when data is nullptr
    std::cout << "Data: " << *data << std::endl;
}

int main() {
    int* ptr = nullptr;
    processData(ptr);  // crash site
    return 0;
}

Example GDB output:

Program received signal SIGSEGV, Segmentation fault.
0x0000000000401123 in processData(int*) (data=0x0) at main.cpp:5
5       *data = 100;
(gdb) bt
#0  0x0000000000401123 in processData(int*) (data=0x0) at main.cpp:5
#1  0x0000000000401145 in main () at main.cpp:11

Scenario 1.2: Out-of-bounds array access

// Buggy code
#include <iostream>
#include <vector>

int main() {
    std::vector<int> vec = {1, 2, 3};
    int value = vec[10];  // out-of-bounds read: undefined behavior, may trigger SIGSEGV
    std::cout << value << std::endl;
    
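    // More dangerous: pointer arithmetic
    int arr[5] = {1, 2, 3, 4, 5};
    int* p = arr;
    p += 100;  // pointer far outside the array
    *p = 42;   // undefined behavior: may trigger SIGSEGV or silently corrupt memory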
    return 0;
}

Scenario 1.3: Use after free

// Buggy code
#include <iostream>

int main() {
    int* ptr = new int(42);
    delete ptr;
    
    // Use after free
    *ptr = 100;  // dangling pointer: undefined behavior, may trigger SIGSEGV
    std::cout << *ptr << std::endl;
    
    return 0;
}

Example GDB output (for a run where the access faults):

Program received signal SIGSEGV, Segmentation fault.
0x0000000000401156 in main () at main.cpp:9
9       *ptr = 100;
(gdb) info registers
rax            0x0                 0
rbx            0x5555555592a0      93824992247840
rcx            0x0                 0
rdx            0x64                100
rsi            0x7fffffffdd40      140737488346944
rdi            0x1                 1
rbp            0x7fffffffdd20      0x7fffffffdd20
rsp            0x7fffffffdd18      0x7fffffffdd18

Scenario 1.4: STL container iterator invalidation

// Buggy code
#include <iostream>
#include <vector>

int main() {
    std::vector<int> vec = {1, 2, 3, 4, 5};
    auto it = vec.begin();
    
    vec.push_back(6);  // may reallocate and invalidate the iterator
    
    *it = 10;  // uses an invalidated iterator; may trigger SIGSEGV
    return 0;
}

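2. System limits

Scenario 2.1: Stack overflow

// Buggy code: recursion deep enough to overflow the stack
#include <iostream>

void recursiveFunction(int depth) {
    int largeArray[1000];  // large array on the stack (about 4 KB per frame)
    if (depth > 0) {
        recursiveFunction(depth - 1);  // recursion goes 10000 levels deep
    }
}

int main() {
    // About 40 MB of stack in an unoptimized build, far above the common 8 MB default: SIGSEGV
    recursiveFunction(10000);
    return 0;
}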

Example GDB output:

Program received signal SIGSEGV, Segmentation fault.
0x0000000000401120 in recursiveFunction(int) (depth=...) at main.cpp:5
5       int largeArray[1000];
(gdb) bt
#0  0x0000000000401120 in recursiveFunction(int) (depth=...) at main.cpp:5
#1  0x0000000000401135 in recursiveFunction(int) (depth=...) at main.cpp:7
#2  0x0000000000401135 in recursiveFunction(int) (depth=...) at main.cpp:7
... (repeated many times)
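
One common way out of this kind of stack overflow is to replace deep recursion with a loop and move large buffers to the heap. The following is only a sketch of that idea: processLevels is a hypothetical stand-in for recursiveFunction, and the per-level work is a placeholder.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical replacement for recursiveFunction: it walks the same number of
// levels, but reuses one heap buffer, so stack usage stays constant.
std::size_t processLevels(int depth) {
    std::vector<int> scratch(1000);  // storage lives on the heap, not the stack
    std::size_t visited = 0;
    for (int level = depth; level > 0; --level) {
        scratch[0] = level;  // placeholder for the real per-level work
        ++visited;
    }
    return visited;
}
```

Calling processLevels(10000) visits 10000 levels without growing the stack. When recursion cannot be removed, raising the stack limit is the other usual option, though it only moves the threshold.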

3. External triggers

Scenario 3.1: Data race between threads

// Buggy code: unsynchronized memory access
#include <iostream>
#include <thread>
#include <vector>

int shared_data = 0;

void increment() {
    for (int i = 0; i < 100000; ++i) {
        shared_data++;  // data race: undefined behavior, usually visible here as lost updates
    }
}

int main() {
    std::vector<std::thread> threads;
    for (int i = 0; i < 10; ++i) {
        threads.emplace_back(increment);
    }
    
    for (auto& t : threads) {
        t.join();
    }
    
    // A race on a plain int rarely crashes, but the same bug on pointers or containers can corrupt memory and trigger SIGSEGV
    std::cout << shared_data << std::endl;
    return 0;
}
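
For a plain counter like shared_data, one possible fix is std::atomic. The sketch below is an illustration rather than the article's own fix; countWithAtomics and its parameters are hypothetical names.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical race-free version of the counter: every increment is atomic,
// so no update is lost and the behavior is well defined.
int countWithAtomics(int threadCount, int incrementsPerThread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> threads;
    threads.reserve(static_cast<std::size_t>(threadCount));
    for (int t = 0; t < threadCount; ++t) {
        threads.emplace_back([&counter, incrementsPerThread] {
            for (int i = 0; i < incrementsPerThread; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : threads) {
        th.join();
    }
    return counter.load();
}
```

countWithAtomics(10, 100000) returns 1000000 on every run, whereas the unsynchronized version may print a smaller number. Relaxed ordering is enough here because only the final total is read, and only after all threads have been joined.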

4. Runtime anomalies

Scenario 4.1: Memory problems caused by smart pointer reference cycles

// Buggy code: a reference cycle breaks automatic cleanup
#include <memory>
#include <iostream>

struct Node {
    std::shared_ptr<Node> next;
    int value;
    
    ~Node() {
        std::cout << "Node destroyed" << std::endl;
    }
};

int main() {
    auto node1 = std::make_shared<Node>();
    auto node2 = std::make_shared<Node>();
    
    node1->next = node2;
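    node2->next = node1;  // reference cycle
    
    // This does not trigger SIGSEGV by itself: the two nodes leak and neither destructor runs.
    // If later code then accesses memory that has already been freed, SIGSEGV can follow.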
    return 0;
}
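
A reference cycle is usually broken by making one direction of the link non-owning with std::weak_ptr. The sketch below assumes a doubly linked layout; LinkedNode, its alive counter, and liveNodesAfterScope are hypothetical names used only to make the effect observable.

```cpp
#include <memory>

// Hypothetical node type: the forward link owns, the back link only observes.
struct LinkedNode {
    std::shared_ptr<LinkedNode> next;  // owning link
    std::weak_ptr<LinkedNode> prev;    // non-owning back link, breaks the cycle
    static int alive;                  // number of nodes currently alive
    LinkedNode() { ++alive; }
    ~LinkedNode() { --alive; }
};
int LinkedNode::alive = 0;

// Builds a two-node list in an inner scope and reports how many nodes
// are still alive after the scope ends.
int liveNodesAfterScope() {
    {
        auto a = std::make_shared<LinkedNode>();
        auto b = std::make_shared<LinkedNode>();
        a->next = b;
        b->prev = a;  // weak reference: does not keep a alive
    }
    return LinkedNode::alive;
}
```

liveNodesAfterScope() returns 0 because both destructors run. Code that needs to follow prev calls lock() and checks the result, which turns a potential dangling access into an explicit empty-pointer check.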

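Commonly confused cases

SIGSEGV vs SIGBUS

SIGSEGV: the virtual address is not mapped for the process, or it is mapped but the access violates the page permissions (for example, writing to read-only memory).

int* ptr = nullptr;
*ptr = 42;  // SIGSEGV: address 0x0 is not mapped

SIGBUS: the address lies inside a valid mapping, but the hardware or kernel cannot complete the access (for example, a misaligned access on a strict-alignment architecture, or touching a file-backed mapping beyond the end of the file).

// On strict-alignment architectures such as SPARC
char buffer[100];
int* unaligned_ptr = (int*)(buffer + 1);  // misaligned address
*unaligned_ptr = 42;  // SIGBUS: the address is mapped but misaligned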
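
The distinction can be checked empirically. The sketch below assumes Linux behavior and a writable temporary directory; terminatingSignal, writeToReadOnlyPage, and touchPastEndOfFile are hypothetical helper names. Each faulting access runs in a child process so that the parent can read the terminating signal.

```cpp
#include <csignal>
#include <cstddef>
#include <cstdio>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs fn in a child process; returns the terminating signal, or -1.
template <typename Fn>
int terminatingSignal(Fn fn) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        fn();
        _exit(0);  // reached only if fn did not fault
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}

// Writes to a page that is mapped read-only: a permission violation.
void writeToReadOnlyPage() {
    std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    void* mem = mmap(nullptr, page, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) _exit(1);
    *static_cast<volatile char*>(mem) = 1;
}

// Reads from a file-backed mapping whose file is empty: the mapping is
// valid, but there is no file content behind the page.
void touchPastEndOfFile() {
    std::FILE* file = std::tmpfile();  // empty temporary file
    if (file == nullptr) _exit(1);
    std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    void* mem = mmap(nullptr, page, PROT_READ, MAP_SHARED, fileno(file), 0);
    if (mem == MAP_FAILED) _exit(1);
    volatile char value = *static_cast<volatile char*>(mem);
    (void)value;
}
```

On Linux, terminatingSignal(writeToReadOnlyPage) is expected to yield SIGSEGV and terminatingSignal(touchPastEndOfFile) to yield SIGBUS. Other systems may map these faults differently, so treat the result as an observation about the platform rather than a portable guarantee.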

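III. Debugging and Locating the Crash

Key checks

Where to look first when investigating a SIGSEGV crash:

Is the pointer nullptr? Inspect registers or variable values
Are array or container bounds respected? Confirm the index is in range
Is the memory still alive? Confirm it has not already been freed
Is cross-thread access synchronized? Look for data races
Did the stack overflow? Check the call stack depth and the size of local variables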
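
Two of these checks depend on process limits: a core file is only written when the core size limit is non-zero, and the main thread's stack is bounded by the stack limit. As a small aid, the sketch below reads both with getrlimit; describeSoftLimit is a hypothetical helper name, and the output format is an arbitrary choice.

```cpp
#include <string>
#include <sys/resource.h>

// Hypothetical helper: returns the soft limit of a resource as text.
std::string describeSoftLimit(int resource) {
    rlimit limit{};
    if (getrlimit(resource, &limit) != 0) {
        return "unknown";
    }
    if (limit.rlim_cur == RLIM_INFINITY) {
        return "unlimited";
    }
    return std::to_string(limit.rlim_cur) + " bytes";
}
```

For example, describeSoftLimit(RLIMIT_CORE) returning "0 bytes" explains a missing core file, and describeSoftLimit(RLIMIT_STACK) gives the budget against which frame sizes and recursion depth can be compared.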

IV. Fixing the Crash

Fixes by scenario

Scenario 1: Null pointer dereference

Quick fix: add a null check

// Before
void processData(int* data) {
    *data = 100;  // crash site
}

// Quick fix
void processData(int* data) {
    if (data == nullptr) {
        std::cerr << "Error: null pointer" << std::endl;
        return;
    }
    *data = 100;
}

Cleaner fix: use a reference or a smart pointer

// Cleaner fix 1: take a reference (when the argument must never be null)
void processData(int& data) {  // a reference cannot be null
    data = 100;
}

// Cleaner fix 2: take a smart pointer
#include <memory>
#include <stdexcept>

void processData(std::shared_ptr<int> data) {
    if (!data) {
        throw std::invalid_argument("data cannot be null");
    }
    *data = 100;
}

// Cleaner fix 3: use std::optional (C++17)
#include <optional>

void processData(std::optional<int>& data) {
    if (!data.has_value()) {
        throw std::invalid_argument("data has no value");
    }
    data.value() = 100;
}

Scenario 2: Out-of-bounds array access

Quick fix: bounds check

// Before
std::vector<int> vec = {1, 2, 3};
int value = vec[10];  // out of bounds

// Quick fix
std::vector<int> vec = {1, 2, 3};
if (10 < vec.size()) {
    int value = vec[10];
} else {
    std::cerr << "Index out of bounds" << std::endl;
}

Cleaner fix: use at() or a range-checking helper

// Cleaner fix 1: use at() (throws on a bad index)
#include <stdexcept>

std::vector<int> vec = {1, 2, 3};
try {
    int value = vec.at(10);  // throws std::out_of_range when out of bounds
} catch (const std::out_of_range& e) {
    std::cerr << "Index out of bounds: " << e.what() << std::endl;
}

// Cleaner fix 2: a range-checking helper
template<typename Container>
auto safe_at(Container& c, size_t index) -> decltype(c[0]) {
    if (index >= c.size()) {
        throw std::out_of_range("Index out of bounds");
    }
    return c[index];
}

// Usage (throws std::out_of_range here, since vec has only 3 elements)
int value = safe_at(vec, 10);

Scenario 3: Use after free

Quick fix: null the pointer

// Before
int* ptr = new int(42);
delete ptr;
*ptr = 100;  // use after free

// Quick fix
int* ptr = new int(42);
delete ptr;
ptr = nullptr;  // null it so that any later access can be caught by a check
if (ptr != nullptr) {
    *ptr = 100;
}

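Cleaner fix: use smart pointers

// Cleaner fix: std::unique_ptr
#include <memory>

{
    auto ptr = std::make_unique<int>(42);
    // Memory is managed automatically and released at the end of the scope
    *ptr = 100;
}  // ptr is released here and is no longer accessible

// When shared ownership is needed
auto shared = std::make_shared<int>(42);
// Several owners can safely share this pointer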

Scenario 4: STL container iterator invalidation

Quick fix: re-acquire the iterator

// Before
std::vector<int> vec = {1, 2, 3};
auto it = vec.begin();
vec.push_back(4);  // may reallocate
*it = 10;  // uses an invalidated iterator

// Quick fix
std::vector<int> vec = {1, 2, 3};
vec.push_back(4);
auto it = vec.begin();  // re-acquire the iterator
*it = 10;

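Cleaner fix: use an index or a range-based for

// Cleaner fix 1: use an index
std::vector<int> vec = {1, 2, 3};
vec.push_back(4);
if (!vec.empty()) {
    vec[0] = 10;  // an index stays valid across reallocation
}

// Cleaner fix 2: range-based for (C++11)
std::vector<int> vec = {1, 2, 3};
vec.push_back(4);
for (auto& value : vec) {  // iterators are obtained after the push_back; do not resize vec inside the loop
    value = 10;
    break;  // modify only the first element
}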

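Verifying the fix

Unit tests that cover the failure cases

#include <gtest/gtest.h>

TEST(PointerTest, NullPointerHandling) {
    std::shared_ptr<int> ptr;  // empty: exercises the std::shared_ptr overload, which throws
    EXPECT_THROW(processData(ptr), std::invalid_argument);
}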

TEST(VectorTest, OutOfBoundsAccess) {
    std::vector<int> vec = {1, 2, 3};
    EXPECT_THROW(vec.at(10), std::out_of_range);
}

Stress tests that reproduce the crash conditions

// Stress test: hit the boundary condition repeatedly
void stressTest() {
    for (int i = 0; i < 1000000; ++i) {
        std::vector<int> vec(1000);
        try {
            int value = vec.at(1001);  // should throw and be caught
        } catch (const std::out_of_range&) {
            // expected path
        }
    }
}

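Pitfalls

Do not ignore SIGSEGV: once caught, it must be handled properly (record what is needed, then terminate); carrying on as if nothing happened risks data corruption
Do not allocate memory in a signal handler: handlers must restrict themselves to async-signal-safe functions
Do not rely on undefined behavior: it is dangerous even when the program does not crash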

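V. Long-Term Prevention

Coding conventions

Coding habits that help avoid SIGSEGV in C++:

Check pointers that may be null before dereferencing them

if (ptr != nullptr) {
    *ptr = value;
}

Avoid wild pointers: initialize every pointer
```
int* ptr = nullptr;  // always initialize
```
Prefer references to pointers: a reference cannot be null
```
void func(int& value);  // rather than int* value
```
Prefer containers to raw arrays: std::vector offers checked access through at()
```
std::vector<int> arr;  // rather than int arr[100]
```
Manage memory with smart pointers: lifetimes are handled automatically
```
std::unique_ptr<int> ptr = std::make_unique<int>(42);
```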

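Compile time

Enable defensive compiler options (intended for debug and test builds):

# Defensive compiler options
g++ -Wall -Wextra -Werror \
    -fsanitize=address \
    -fsanitize=undefined \
    -fno-omit-frame-pointer \
    -g -O0 \
    -o program main.cpp

-Wall -Wextra: enable a broad set of warnings (despite the name, not all of them)
-Werror: treat warnings as errors
-fsanitize=address: enable AddressSanitizer
-fsanitize=undefined: detect undefined behavior at run time
-fno-omit-frame-pointer: keep the frame pointer for better stack traces
-g: include debug information
-O0: disable optimization to ease debugging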

Testing strategy

Targeted test cases:

// Boundary tests
TEST(BoundaryTest, EmptyVector) {
    std::vector<int> vec;
    EXPECT_THROW(vec.at(0), std::out_of_range);
}

// Fault injection test
TEST(ExceptionTest, NullPointerInjection) {
    int* volatile ptr = nullptr;  // volatile: the compiler cannot assume the value and drop the store
    EXPECT_DEATH(*ptr = 42, ".*");
}

// Stress test
TEST(StressTest, LargeAllocation) {
    for (int i = 0; i < 1000; ++i) {
        std::vector<int> vec(1000000);
        // check that repeated large allocations cause no problems
    }
}

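Production monitoring

Capture crash information as it happens:

#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

void segfaultHandler(int sig) {
    (void)sig;
    // Only async-signal-safe calls here: no iostream, no malloc, no exit()
    const char msg[] = "SIGSEGV caught! Stack trace:\n";
    ssize_t written = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)written;

    // Print the call stack straight to stderr without allocating
    void* frames[10];
    int count = backtrace(frames, 10);
    backtrace_symbols_fd(frames, count, STDERR_FILENO);

    // Write a log record here with open()/write() if needed

    // Terminate without running atexit handlers or destructors
    _exit(1);
}

int main() {
    void* prime[1];
    backtrace(prime, 1);  // load the unwinder up front, outside the handler
    signal(SIGSEGV, segfaultHandler);
    // Application code
    return 0;
}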

Tooling

Clang-Tidy

# Install
sudo apt-get install clang-tidy

# Run
clang-tidy main.cpp -- -std=c++17

Example .clang-tidy file (a single leading -* disables the defaults, then the listed groups are enabled):

Checks: >
  -*,
  bugprone-*,
  cppcoreguidelines-*,
  modernize-*,
  performance-*,
  readability-*

Cppcheck

# Install
sudo apt-get install cppcheck

# Run
cppcheck --enable=all --std=c++17 main.cpp

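VI. Further Topics

Property	SIGSEGV	SIGBUS
Cause	Address not mapped, or access forbidden by page permissions	Address mapped, but the access cannot be completed
Typical cases	Null pointer, out-of-bounds access, write to read-only memory	Misaligned access, access past the end of a mapped file
Catchable	Yes	Yes
Default action	Terminate + core	Terminate + core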

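Advanced technique: a custom user-space signal handler

#include <execinfo.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

// write() wrapper: async-signal-safe, unlike std::cerr
static void safeWrite(const char* text) {
    ssize_t written = write(STDERR_FILENO, text, strlen(text));
    (void)written;
}

// Prints a value as hexadecimal without printf or heap allocation
static void safeWriteHex(uintptr_t value) {
    char buf[2 + 2 * sizeof(value) + 1] = "0x";
    char* out = buf + 2;
    for (int shift = static_cast<int>(8 * sizeof(value)) - 4; shift >= 0; shift -= 4) {
        *out++ = "0123456789abcdef"[(value >> shift) & 0xF];
    }
    *out = '\0';
    safeWrite(buf);
}

void segfaultHandler(int sig, siginfo_t* info, void* context) {
    (void)sig;
    (void)context;
    safeWrite("=== SIGSEGV Detected ===\nFault address: ");
    safeWriteHex(reinterpret_cast<uintptr_t>(info->si_addr));
    // si_code tells an unmapped address (SEGV_MAPERR) from a permission fault (SEGV_ACCERR)
    safeWrite(info->si_code == SEGV_MAPERR ? "\nFault code: SEGV_MAPERR\n"
                                           : "\nFault code: other (for example SEGV_ACCERR)\n");

    safeWrite("Stack trace:\n");
    void* frames[50];
    int count = backtrace(frames, 50);
    backtrace_symbols_fd(frames, count, STDERR_FILENO);

    // Save state or send an alert here (async-signal-safe calls only), then exit
    _exit(1);
}

int main() {
    void* prime[1];
    backtrace(prime, 1);  // load the unwinder before any signal arrives

    struct sigaction sa = {};
    sa.sa_sigaction = segfaultHandler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;

    sigaction(SIGSEGV, &sa, nullptr);

    // Application code
    int* volatile ptr = nullptr;
    *ptr = 42;  // triggers the custom handler

    return 0;
}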
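
A handler installed this way runs on the faulting thread's own stack, so it cannot run when the fault is a stack overflow. A common remedy is an alternate signal stack. The sketch below is a minimal illustration under stated assumptions: the 64 KiB stack size, the exit code 42, and the names onStackOverflow, installAltStackHandler, recurseForever, and exitCodeAfterOverflow are all choices made for this example, and a finite stack limit is assumed.

```cpp
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Handler: async-signal-safe calls only.
extern "C" void onStackOverflow(int) {
    const char msg[] = "SIGSEGV handled on the alternate stack\n";
    ssize_t written = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)written;
    _exit(42);  // assumed exit code, chosen so the parent can recognize it
}

// Registers an alternate stack and a SIGSEGV handler that runs on it.
bool installAltStackHandler() {
    static char altStack[64 * 1024];  // assumed size for this sketch
    stack_t ss{};
    ss.ss_sp = altStack;
    ss.ss_size = sizeof(altStack);
    ss.ss_flags = 0;
    if (sigaltstack(&ss, nullptr) != 0) return false;

    struct sigaction sa{};
    sa.sa_handler = onStackOverflow;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_ONSTACK;  // run the handler on the alternate stack
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}

// Deliberately unbounded recursion; the volatile buffer keeps each frame real.
__attribute__((noinline)) int recurseForever(int depth) {
    volatile char pad[1024];
    pad[0] = static_cast<char>(depth);
    return recurseForever(depth + 1) + pad[0];
}

// Overflows the stack in a child process and returns the child's exit code.
int exitCodeAfterOverflow() {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        if (!installAltStackHandler()) _exit(1);
        _exit(recurseForever(0));
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With the alternate stack in place, exitCodeAfterOverflow() is expected to return 42, showing that the handler ran. Without SA_ONSTACK the kernel has nowhere to push the signal frame, and the child would be killed by SIGSEGV instead.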

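Case studies

Case 1: Crash caused by STL container iterator invalidation

Symptom: under high concurrency the program crashed occasionally, and the core dump showed SIGSEGV.

Investigation:

GDB showed the crash in std::vector::operator[]
The call stack showed a push_back shortly before the crash
Reproducing the problem under ASan pointed to an invalidated iterator

Root cause:

// Problem code
for (auto it = vec.begin(); it != vec.end(); ++it) {
    if (condition) {
        vec.push_back(newValue);  // may reallocate and invalidate it
    }
    process(*it);  // uses an invalidated iterator
}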

Fix:

// Fixed code
for (size_t i = 0; i < vec.size(); ++i) {
    if (condition) {
        vec.push_back(newValue);  // the loop ends only if condition eventually becomes false
    }
    process(vec[i]);  // an index stays valid across reallocation
}
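
Another option, when the newly added elements do not need to be processed in the same pass, is to stage them in a separate vector and append once the loop is done. The sketch below uses a made-up rule (append ten times every even element) purely for illustration; appendDerived is a hypothetical name.

```cpp
#include <vector>

// Hypothetical example: the container is never resized while it is being
// iterated, so no iterator or reference can be invalidated mid-loop.
std::vector<int> appendDerived(std::vector<int> values) {
    std::vector<int> pending;  // new elements are staged here
    for (int v : values) {     // values is only read inside this loop
        if (v % 2 == 0) {
            pending.push_back(v * 10);
        }
    }
    values.insert(values.end(), pending.begin(), pending.end());
    return values;
}
```

appendDerived({1, 2, 3, 4}) yields {1, 2, 3, 4, 20, 40}. Unlike the index-based loop, the staged elements are not visited by the loop itself, which also rules out accidental unbounded growth.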

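Case 2: Memory corruption caused by a multithreaded data race

Symptom: a multithreaded program crashed at random during stress testing.

Investigation:

ThreadSanitizer reported a data race
Several threads were accessing a shared data structure at the same time
The resulting memory corruption made a later access trigger SIGSEGV

Root cause:

// Problem code
#include <vector>

std::vector<int> shared_vec;

void threadFunc() {
    shared_vec.push_back(42);  // called from several threads without a lock
}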

Fix:

// Fixed code
#include <mutex>
#include <vector>

std::vector<int> shared_vec;
std::mutex vec_mutex;

void threadFunc() {
    std::lock_guard<std::mutex> lock(vec_mutex);
    shared_vec.push_back(42);  // protected by the lock
}
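
To check that a lock like this actually serializes the pushes, a small harness can run many writers and verify the final size. The sketch below is one way to do that; guardedPushCount and its parameters are hypothetical names, and the vector is kept local so the example is self-contained.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical harness: several threads push into one vector under a mutex;
// the return value is the number of elements stored at the end.
std::size_t guardedPushCount(int threadCount, int pushesPerThread) {
    std::vector<int> shared;
    std::mutex sharedMutex;
    std::vector<std::thread> workers;
    for (int t = 0; t < threadCount; ++t) {
        workers.emplace_back([&shared, &sharedMutex, pushesPerThread] {
            for (int i = 0; i < pushesPerThread; ++i) {
                std::lock_guard<std::mutex> lock(sharedMutex);
                shared.push_back(i);  // only one thread at a time reaches this line
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return shared.size();
}
```

guardedPushCount(8, 1000) returns 8000. Running the same harness under ThreadSanitizer (-fsanitize=thread) is a reasonable way to confirm that no race remains.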

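Summary

SIGSEGV is the most common crash signal in C++ development, and its root cause is an invalid memory access. The following practices help:

Prevent: use modern C++ features such as smart pointers, containers, and references
Detect: enable tools such as AddressSanitizer and Valgrind
Debug: learn GDB and analyze core dumps
Fix: combine quick fixes with cleaner long-term fixes
Monitor: catch the signal in production and record detailed information

Together these reduce SIGSEGV crashes and improve program stability.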
