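Beyond Observability: Modifying Syscall Behavior with eBPF

In my previous two articles, we explored eBPF and its ability to use probes in kernel space. With the "Open File Alert" and "Socket Watch" experiments, we only scratched the surface of eBPF's observability features, which leave syscall behavior untouched. Now we go a step further and use eBPF not only to observe syscalls but also to influence them.

eBPF goes beyond plain observability: it lets us actively modify syscall behavior, which opens the door to more interactive and controlled interaction with the system.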

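eBPF Helpers

eBPF helpers are predefined functions provided by the kernel that let eBPF programs interact with the kernel. They give controlled access to kernel data and functions, so a program can read or modify data related to syscalls and other kernel operations.

Helpers are what make it possible to modify syscall behavior: an eBPF program can use them to alter data associated with a syscall, change its return value, or even redirect network packets.

These helpers are used by eBPF programs to interact with the system, or with the context in which they work. For instance, they can be used to print debugging messages, to get the time since the system was booted, to interact with eBPF maps, or to manipulate network packets. Since there are several eBPF program types, and that they do not run in the same context, each program type can only call a subset of those helpers.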

man7.org/linux/man-p...

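In today's experiment, we'll look at the practical use of two distinct but powerful eBPF helpers: bpf_override_return and bpf_send_signal.

The bpf_override_return helper modifies the return value of a syscall, letting us steer system interactions with custom logic. bpf_send_signal, on the other hand, sends a signal to the process of the current task. Together, these helpers show the flexibility and control eBPF offers, and they lay the groundwork for modifying syscall behavior in real applications.

A special note on bpf_override_return:

This helper has security implications, and thus is subject to restrictions. It is only available if the kernel was compiled with the CONFIG_BPF_KPROBE_OVERRIDE configuration option, and in this case it only works on functions tagged with ALLOW_ERROR_INJECTION in the kernel code. [Source]( man7.org/linux/man-p...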
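Before running the experiment, it can help to check whether the running kernel was built with CONFIG_BPF_KPROBE_OVERRIDE. The following is a minimal sketch, not part of the original program. It assumes the kernel config is exposed at /proc/config.gz or /boot/config-<release>, which is common but not guaranteed on every distribution. The function names option_enabled and read_kernel_config are made up for this illustration.

```python
import gzip
import os
import platform

OPTION = "CONFIG_BPF_KPROBE_OVERRIDE"


def option_enabled(config_text, option=OPTION):
    """Return True if the kernel config text contains `option=y`."""
    for line in config_text.splitlines():
        if line.strip() == option + "=y":
            return True
    return False


def read_kernel_config():
    """Return the running kernel's config text, or None if it is not readable.

    Assumption: the config lives in one of the two usual locations below.
    """
    candidates = ["/proc/config.gz", "/boot/config-" + platform.release()]
    for path in candidates:
        if not os.path.exists(path):
            continue
        opener = gzip.open if path.endswith(".gz") else open
        try:
            with opener(path, "rt") as f:
                return f.read()
        except OSError:
            continue
    return None


if __name__ == "__main__":
    text = read_kernel_config()
    if text is None:
        print("kernel config not found in the usual locations")
    else:
        print(OPTION, "enabled:", option_enabled(text))
```

If the option is reported as disabled, the bpf_override_return helper will not be available on that kernel.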

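Experiment: My Precious Secret Files

Imagine writing a program that guards our precious secret files with the full power of eBPF. We'll intercept the openat syscall, an old acquaintance from the earlier experiments. Whenever someone tries to access one of our secret files through this syscall, we step in and change its behavior so the files stay protected.

We'll design a two-level security mechanism for our files. At security level 1, we modify the syscall so that it returns an EACCES error to the caller, indicating that access is denied. At security level 2 or higher the measures get stricter: the program sends a SIGKILL signal that terminates the process trying to access the file.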
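To make the policy concrete, here is a small plain-Python model of the decision. It is only an illustration of the rules described above, not code that runs in the kernel. The function name decide and the strings it returns are invented for this sketch, and the root exemption mirrors the uid check in the kernel-space code shown later in this article.

```python
def decide(uid, security_level):
    """Model the two-level policy for one openat attempt.

    uid: numeric user ID of the caller.
    security_level: level stored for the file, or None if the file
    is not in the list of secret files.
    Returns "allow", "deny" (EACCES) or "kill" (SIGKILL).
    """
    if security_level is None:
        return "allow"   # not a secret file
    if uid == 0:
        return "allow"   # root is exempt
    if security_level == 1:
        return "deny"    # syscall returns EACCES
    if security_level > 1:
        return "kill"    # caller receives SIGKILL
    return "allow"       # levels below 1 trigger no action


for uid, level in [(1000, None), (0, 2), (1000, 1), (1000, 2)]:
    print(uid, level, decide(uid, level))
```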

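To tell the kernel which files to protect, we need a channel that carries this information from our user-space application to kernel space. We'll use an eBPF map for this:

If you've read my previous articles, you already know one of the ideal use cases for eBPF maps: acting as a conduit for data exchange between kernel and user space.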

         Kernel Space                         User Space
        +------------------+                 +-------------------+
        |                  |                 |                   |
        |   eBPF Program   |                 |  User Application |
        |                  |   eBPF Maps     |                   |
        | +-------------+  |<--------------->| +---------------+ |
        | | Probe Logic |  |    Interface    | | Map Interface | |
        | +-------------+  |                 | +---------------+ |
        |                  |                 |                   |
        +------------------+                 +-------------------+

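With this setup, the user-space application manages the list of secret files and shares it with kernel space through the map. The eBPF program in kernel space only checks whether the file being opened is in the map and, if so, changes the syscall's behavior accordingly. This gives user space and kernel space a simple way to cooperate and make decisions based on shared data.

Kernel Space

Let's start with the C code of the kernel-space program.

The C code is concise; each part is simple and was explained in the earlier articles. This time the focus is on the eBPF map, which shows one way to manage more complex data relationships in an eBPF program.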

// Define a key structure to hold the file name
struct key_t {
  char fname[NAME_MAX];
};

// Map to store secret files and their associated security levels
BPF_HASH(secret_files, struct key_t, int);

In the snippet above, we use a custom struct, key_t, as the key type of the secret_files eBPF map, which lets us handle a composite key in an organized way. The struct holds the file name, and the map associates it with an integer that represents the file's security level.
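One consequence of using a fixed-size struct as the key is that a hash map lookup compares the whole buffer, so the stored name and the name passed to openat must match byte for byte. The sketch below models the key in user space with ctypes. It is an approximation of the key class BCC generates, not that class itself. NAME_MAX is assumed to be 255, its value in linux/limits.h, and the helpers make_key and same_key are invented for this illustration.

```python
import ctypes as ct

NAME_MAX = 255  # assumed value of NAME_MAX from <linux/limits.h>


class Key(ct.Structure):
    # Mirrors: struct key_t { char fname[NAME_MAX]; };
    _fields_ = [("fname", ct.c_char * NAME_MAX)]


def make_key(path):
    """Build a key whose buffer is the path followed by zero padding."""
    key = Key()                # ctypes zero-initializes the buffer
    key.fname = path.encode()  # raises ValueError if longer than NAME_MAX
    return key


def same_key(a, b):
    """Compare two keys the way a hash map does: over every byte."""
    return bytes(a) == bytes(b)


absolute = make_key("/tmp/secret.txt")
relative = make_key("secret.txt")
print(ct.sizeof(Key))                # size of the key in bytes
print(same_key(absolute, relative))  # different strings, different keys
```

Because the comparison is on the raw string, opening the file through a relative path such as secret.txt from inside /tmp would not match an entry stored as /tmp/secret.txt.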

#include <uapi/linux/ptrace.h>
#include <uapi/linux/limits.h>  // NAME_MAX
#include <linux/sched.h>

// Signal number of SIGKILL, which terminates the receiving process
#define SIGKILL 9

// Define a key structure to hold the file name
struct key_t {
  char fname[NAME_MAX];
};

// Map to store secret files and their associated security levels
BPF_HASH(secret_files, struct key_t, int);

int syscall__openat(struct pt_regs *ctx, int dfd, const char __user *filename, int flags) {
    struct key_t key = {};
    // bpf_get_current_uid_gid() returns (gid << 32 | uid); storing it in a u32 keeps the UID
    u32 uid = bpf_get_current_uid_gid();

    // Read the file name from user space into the zero-initialized key structure
    bpf_probe_read_user_str(&key.fname, sizeof(key.fname), (void *)filename);
    // Look up the file name in the secret_files map to get its security level
    int *security_level = secret_files.lookup(&key);
    if (security_level != NULL) {
        // Root is allowed to open secret files
        if (uid == 0) {
            bpf_trace_printk("Root user opening secret file %s \n", key.fname);
            return 0;
        }

        bpf_trace_printk("Non-root user attempting to open secret file %s with security level %d \n", key.fname, *security_level);
        if (*security_level == 1) {
            // Override the return value of the syscall to indicate permission denied
            bpf_override_return(ctx, -EACCES);
        } else if (*security_level > 1) {
            // If the security level is greater than 1, send SIGKILL to terminate the process
            bpf_send_signal(SIGKILL);
        }
    }

    return 0;
}

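Apart from the map, the notable part of the code above is the use of bpf_override_return and bpf_send_signal, which we discussed earlier.

bpf_send_signal(u32 sig) sends the signal sig to the process of the current task. The signal may be delivered to any of this process's threads. To send the signal to the specific thread corresponding to the current task, use bpf_send_signal_thread(u32 sig).

bpf_override_return(struct pt_regs *regs, u64 rc) is used for error injection: this helper overrides the return value of the probed function (in our case, the openat syscall).

User Space

Moving on to the Python code of our user-space application: it remains simple and resembles our previous experiments.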

from bcc import BPF
import ctypes as ct

# Helper function to add a secret file to the map
def add_secret_file(bpf_map, entry):
    path, level = entry
    key = bpf_map.Key()
    key.fname = path.encode()
    value = ct.c_int(level)
    # Update the map with the new entry
    bpf_map[key] = value


def main():
    # Read the BPF program
    with open("ebpf_program.c") as f:
        bpf_program = f.read()

    # Load the BPF program
    b = BPF(text=bpf_program)

    # Attach the kprobe defined in the eBPF program to the openat system call.
    fnname_openat = b.get_syscall_prefix().decode() + 'openat'
    b.attach_kprobe(event=fnname_openat, fn_name="syscall__openat")

    # Get the map
    secret_files = b.get_table("secret_files")

    # Add the secret files to the map
    for entry in [("/tmp/secret.txt", 1), ("/tmp/ultra_secret.txt", 2)]:
        add_secret_file(secret_files, entry)

    try:
        print("Attaching kprobe to sys_openat... Press Ctrl+C to exit.")
        b.trace_print()
    except KeyboardInterrupt:
        pass


if __name__ == "__main__":
    main()

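In this script, the focus is on obtaining the map and adding entries to it. After loading the eBPF program, the script retrieves the secret_files eBPF map with b.get_table("secret_files"). It then loops over a list of secret files, each represented as a tuple of file path and security level. For each tuple, add_secret_file is called with the map and the tuple, and it creates a new map entry with the file name as the key and the security level as the value.

This is all it takes to populate the eBPF map with secret-file entries, and it shows a direct way to interact with and modify an eBPF map from user space.

Results

To try out our new program, let's create two secret files in the /tmp directory.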

echo "www.kungfudev.com" > /tmp/secret.txt
echo "www.kungfudev.com" > /tmp/ultra_secret.txt

After starting the program with sudo python3 app.py and trying to open the newly created files, we should see some output. Remember that secret.txt has security level 1 and ultra_secret.txt has security level 2, so we should see the following:

$ cat /tmp/secret.txt 
cat: /tmp/secret.txt: Permission denied

$ cat /tmp/ultra_secret.txt 
Killed

# Since we added a root user validation in our eBPF program, we can open the file as root.
$ sudo cat /tmp/ultra_secret.txt 
www.kungfudev.com

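The same check can be scripted. The sketch below is one possible way to tell the two outcomes apart from Python. It relies on the documented subprocess behavior that a child terminated by a signal has a negative return code. The function name run_and_classify and the strings it returns are invented for this illustration.

```python
import signal
import subprocess


def run_and_classify(argv):
    """Run a command and describe how it ended."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    if proc.returncode < 0:
        # A negative return code means the child was terminated by a signal.
        return "killed by " + signal.Signals(-proc.returncode).name
    if proc.returncode == 0:
        return "ok"
    return "failed with exit code %d" % proc.returncode
```

With the eBPF program loaded and run as a non-root user, run_and_classify(["cat", "/tmp/ultra_secret.txt"]) would be expected to report a SIGKILL, while the same call on /tmp/secret.txt would be expected to report a non-zero exit code, because cat fails with "Permission denied".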

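All the code is available in my repository.

Summary

With this simple example, we showed that eBPF goes beyond observability: it can change syscall behavior, which highlights its versatility and its influence on system-level interactions.

Thank you for reading. This blog is part of my learning journey, and I value your feedback. There is much more to explore and share about eBPF, so stay tuned for upcoming articles. Feel free to share your insights and experiences so we can learn and grow together in this field. Happy coding!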