A Walkthrough of How the Linux Virtual File System Is Set Up and Used

Contents

- 1. Setting up the VFS
  - 1.1 Kernel initialization (at boot)
    - 1.1.1 Initializing the core VFS data structures
  - 1.2 Registering filesystem types (when a kernel module loads)
    - 1.2.1 Filesystem registration
    - 1.2.2 Listing the registered filesystems
  - 1.3 Mounting a filesystem (creating the super_block)
    - 1.3.1 The mount path
    - 1.3.2 How the super_block is created
    - 1.3.3 Reading the superblock from disk
  - 1.4 Attaching to the mount namespace (vfsmount)
- 2. Using the VFS
  - 2.1 System call entry
  - 2.2 Path resolution (the core step)
    - 2.2.1 Path resolution in detail
    - 2.2.2 dentry lookup (dcache first)
    - 2.2.3 Directory lookup (ext4 as the example)
  - 2.3 Creating the file object
  - 2.4 Binding to the process
  - 2.5 Subsequent I/O
- 3. End-to-end sequence diagram
- 4. Tools for inspecting VFS state
  - 4.1 Viewing mount information
  - 4.2 Viewing the dentry and inode caches
  - 4.3 Viewing the slab caches
  - 4.4 Viewing a process's fd table
  - 4.5 Tracing system calls
- 5. Key takeaways
  - 5.1 Characteristics of the setup phase
  - 5.2 Characteristics of the usage phase
  - 5.3 Performance levers
- 6. Where VFS objects live: a summary
- Closing remarks

This article walks through the Linux Virtual File System (VFS) from setup to use: kernel initialization, filesystem registration, mounting, path resolution, and the I/O operations that follow.

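1. Setting up the VFS

The VFS is not built in one shot. It comes up in layers, and most objects are created on demand:

Kernel boot → register filesystem types → mount a filesystem → create VFS objects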

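1.1 Kernel initialization (at boot)

1.1.1 Initializing the core VFS data structures

During boot the kernel sets up the global data structures and object caches that the VFS needs:

// Boot-time setup (simplified; the real functions pass more flags,
// and on some configurations the hash tables are allocated earlier,
// in the *_init_early() variants)

// fs/dcache.c
void __init vfs_caches_init(void)
{
    dcache_init();   // dentry slab cache and dentry hash table
    inode_init();    // inode slab cache and inode hash table
    files_init();    // slab cache for struct file
    mnt_init();      // mount structures; also sets up rootfs
}

// fs/dcache.c
static void __init dcache_init(void)
{
    // Slab cache for struct dentry (listed as "dentry" in /proc/slabinfo)
    dentry_cache = kmem_cache_create("dentry",
                                     sizeof(struct dentry),
                                     0, SLAB_HWCACHE_ALIGN,
                                     NULL);

    // Global dentry hash table
    dentry_hashtable = alloc_large_system_hash("Dentry cache", ...);
}

// fs/inode.c
void __init inode_init(void)
{
    // Slab cache for struct inode
    inode_cachep = kmem_cache_create("inode_cache",
                                     sizeof(struct inode),
                                     0, SLAB_HWCACHE_ALIGN,
                                     NULL);

    // Global inode hash table
    inode_hashtable = alloc_large_system_hash("Inode-cache", ...);
}

// fs/file_table.c
void __init files_init(void)
{
    // Slab cache for struct file
    filp_cachep = kmem_cache_create("filp",
                                    sizeof(struct file),
                                    0, SLAB_HWCACHE_ALIGN,
                                    NULL);
}

What exists at this point:

- slab caches (inode_cache, dentry, filp)
- global hash tables (inode_hashtable, dentry_hashtable)
- the list of superblocks (super_blocks)
- the list of filesystem types (file_systems)

What does not exist yet:

- concrete super_block instances (created at mount time)
- concrete inode and dentry instances (created when files are accessed)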
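
To make the idea of a slab-style object cache concrete, here is a minimal user-space sketch. It is not the kernel's slab allocator: `obj_cache` and its functions are hypothetical names, and the sketch models only one property, namely that freed objects are parked on a free list and handed out again instead of going back to the general allocator.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for a slab cache (illustrative names). */
struct obj_cache {
    size_t obj_size;    /* size of every object handed out */
    void  *free_list;   /* singly linked list of recycled objects */
    size_t nr_allocs;   /* objects obtained from malloc() */
    size_t nr_reuses;   /* objects served from the free list */
};

static void obj_cache_init(struct obj_cache *c, size_t obj_size)
{
    /* A free object stores the "next" pointer in its own memory,
     * so every object must be at least pointer-sized. */
    c->obj_size  = obj_size < sizeof(void *) ? sizeof(void *) : obj_size;
    c->free_list = NULL;
    c->nr_allocs = 0;
    c->nr_reuses = 0;
}

static void *obj_cache_alloc(struct obj_cache *c)
{
    if (c->free_list) {
        void *obj = c->free_list;
        c->free_list = *(void **)obj;   /* pop the head */
        c->nr_reuses++;
        return obj;
    }
    c->nr_allocs++;
    return malloc(c->obj_size);
}

static void obj_cache_free(struct obj_cache *c, void *obj)
{
    *(void **)obj = c->free_list;       /* push onto the free list */
    c->free_list = obj;
}

/* Hand the parked objects back to the general allocator. */
static void obj_cache_destroy(struct obj_cache *c)
{
    while (c->free_list) {
        void *obj = c->free_list;
        c->free_list = *(void **)obj;
        free(obj);
    }
}
```

Allocating, freeing and allocating again returns the same address, which is the kind of reuse that keeps frequent inode, dentry and file allocations cheap.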

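1.2 Registering filesystem types (when a kernel module loads)

Each filesystem (ext4, xfs, tmpfs, and so on) must register itself before the VFS can use it. Built-in filesystems register during boot; modular ones register when the module is loaded.

1.2.1 Filesystem registration

// ext4 registration (fs/ext4/super.c, simplified; newer kernels use
// the fs_context API with .init_fs_context instead of .mount)
static struct file_system_type ext4_fs_type = {
    .owner    = THIS_MODULE,
    .name     = "ext4",
    .mount    = ext4_mount,       // mount callback
    .kill_sb  = kill_block_super, // teardown callback used at unmount
    .fs_flags = FS_REQUIRES_DEV,  // needs a block device
};

// Called at module initialization
static int __init ext4_init_fs(void)
{
    // Add ext4 to the VFS's file_systems list
    return register_filesystem(&ext4_fs_type);
}

The global list after registration:

file_systems (global list)
    ↓
ext4_fs_type  → name: "ext4",  mount: ext4_mount
    ↓
xfs_fs_type   → name: "xfs",   mount: xfs_fs_mount
    ↓
shmem_fs_type → name: "tmpfs", mount: shmem_mount
    ↓
... (dozens of filesystems)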
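
The registration list can be modeled in a few lines of user-space C. The sketch below is an assumption-laden miniature, not kernel code: `fs_type`, `register_fs`, `unregister_fs` and `get_fs` are hypothetical names. It does follow the behavior of the kernel functions it imitates in two respects: registering a name twice fails with -EBUSY, and unregistering an unknown type fails with -EINVAL.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Illustrative miniature of struct file_system_type. */
struct fs_type {
    const char     *name;
    int             requires_dev;  /* 0 would be listed as "nodev" */
    struct fs_type *next;
};

static struct fs_type *fs_list;    /* plays the role of file_systems */

/* Return the link that points at `name`, or the terminating NULL link. */
static struct fs_type **find_fs(const char *name)
{
    struct fs_type **p;

    for (p = &fs_list; *p; p = &(*p)->next)
        if (strcmp((*p)->name, name) == 0)
            break;
    return p;
}

static int register_fs(struct fs_type *fs)
{
    struct fs_type **p = find_fs(fs->name);

    if (*p)
        return -EBUSY;             /* the name is already taken */
    fs->next = NULL;
    *p = fs;                       /* append at the tail */
    return 0;
}

static int unregister_fs(struct fs_type *fs)
{
    struct fs_type **p = find_fs(fs->name);

    if (*p != fs)
        return -EINVAL;            /* not on the list */
    *p = fs->next;
    fs->next = NULL;
    return 0;
}

/* Look a type up by name, as mount does with the "ext4" string. */
static struct fs_type *get_fs(const char *name)
{
    return *find_fs(name);
}
```

Returning a pointer to the link rather than to the node lets the same helper serve lookup, append and removal without special-casing the list head.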

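1.2.2 Listing the registered filesystems

# List the filesystems registered with the running kernel
cat /proc/filesystems

# Sample output:
nodev   sysfs
nodev   rootfs
nodev   tmpfs
        ext4
        xfs
nodev   proc
...

nodev means the filesystem does not need a block device (tmpfs and procfs, for example).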

1.3 Mounting a filesystem (creating the super_block)

Only when a command such as mount /dev/sda1 /mnt runs does the VFS create an actual super_block instance.

1.3.1 The mount path

// System call: mount("/dev/sda1", "/mnt", "ext4", 0, NULL)
// Kernel path: fs/namespace.c → do_mount() → do_new_mount()
// (legacy .mount API, simplified)

// 1. Look up the registered file_system_type by name
struct file_system_type *type = get_fs_type("ext4");

// 2. Call the filesystem's mount callback
struct dentry *root = type->mount(type, flags, "/dev/sda1", data);

// 3. The ext4 mount callback (fs/ext4/super.c)
static struct dentry *ext4_mount(struct file_system_type *fs_type,
                                 int flags,
                                 const char *dev_name,
                                 void *data)
{
    // mount_bdev() opens the block device, finds or allocates the
    // super_block via sget(), calls ext4_fill_super() to read the
    // on-disk superblock into memory, and returns a reference to
    // the root dentry (dget(s->s_root)).
    return mount_bdev(fs_type, flags, dev_name, data, ext4_fill_super);
}

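1.3.2 How the super_block is created

// fs/super.c (simplified; sget() calls alloc_super() and then links
// the new super_block into the global lists)
static struct super_block *alloc_super(struct file_system_type *type, int flags)
{
    // 1. Allocate zeroed memory
    struct super_block *s = kzalloc(sizeof(struct super_block), GFP_USER);

    if (!s)
        return NULL;

    // 2. Initialize
    s->s_flags = flags;
    s->s_op = &default_op;         // replaced by the filesystem in fill_super
    INIT_LIST_HEAD(&s->s_mounts);  // list of mounts of this super_block
    INIT_LIST_HEAD(&s->s_inodes);  // list of in-memory inodes

    return s;
}

// 3. Back in sget(): record the type and join the global lists
s->s_type = type;
list_add_tail(&s->s_list, &super_blocks);
hlist_add_head(&s->s_instances, &type->fs_supers);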

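1.3.3 Reading the superblock from disk

// fs/ext4/super.c (simplified; most error paths trimmed)
static int ext4_fill_super(struct super_block *sb, void *data, int silent)
{
    struct ext4_super_block *es;
    struct buffer_head *bh;
    struct inode *root;
    int blocksize = sb_min_blocksize(sb, EXT4_MIN_BLOCK_SIZE);

    // 1. Read the superblock, which starts 1024 bytes into the device.
    //    With blocks larger than 1 KiB it sits at an offset inside block 0.
    bh = sb_bread(sb, 1024 / blocksize);
    if (!bh)
        return -EIO;
    es = (struct ext4_super_block *)(bh->b_data + 1024 % blocksize);

    // 2. Check the magic number
    sb->s_magic = le16_to_cpu(es->s_magic);
    if (sb->s_magic != EXT4_SUPER_MAGIC) {
        brelse(bh);
        return -EINVAL;
    }

    // 3. Fill in the in-memory super_block (the real code first switches
    //    to the block size recorded in es->s_log_block_size)
    sb->s_op = &ext4_sops;         // ext4's super_operations
    sb->s_maxbytes = ext4_max_size(sb->s_blocksize_bits,
                                   ext4_has_feature_huge_file(sb));
    sb->s_time_gran = 1;           // 1 ns when inodes hold extended timestamps

    // 4. Load the root directory inode (inode number 2)
    root = ext4_iget(sb, EXT4_ROOT_INO);
    if (IS_ERR(root))
        return PTR_ERR(root);
    sb->s_root = d_make_root(root);  // consumes the inode reference
    if (!sb->s_root)
        return -ENOMEM;

    return 0;
}

VFS state at this point:

super_block (created)
    ├─ s_type → ext4_fs_type
    ├─ s_op → ext4_sops
    ├─ s_root → dentry("/")
    │            └─ d_inode → inode (root directory)
    └─ s_inodes → [list of inodes]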
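
The magic-number check can be tried from user space on a raw image or on the first few KiB read from a block device. The sketch below relies on three facts of the on-disk format: the superblock starts at byte 1024, `s_magic` is a little-endian 16-bit field at byte 56 (0x38) of the superblock, and its value is 0xEF53 for ext2, ext3 and ext4 alike. The function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define SB_OFFSET       1024u    /* superblock starts 1024 bytes into the device */
#define SB_MAGIC_OFFSET 56u      /* s_magic sits at byte 0x38 of the superblock */
#define EXT_MAGIC       0xEF53u  /* shared by ext2, ext3 and ext4 */

/* Decode a little-endian 16-bit value regardless of host byte order. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Return 1 if the image carries the ext2/3/4 magic number, 0 otherwise. */
static int has_ext_magic(const unsigned char *img, size_t len)
{
    size_t pos = SB_OFFSET + SB_MAGIC_OFFSET;

    if (len < pos + 2)
        return 0;                /* image too short to hold the field */
    return read_le16(img + pos) == EXT_MAGIC;
}

/* Which block holds the superblock, and at what offset inside that block? */
static void sb_location(unsigned blocksize, unsigned *block, unsigned *offset)
{
    *block  = SB_OFFSET / blocksize;
    *offset = SB_OFFSET % blocksize;
}
```

With 1 KiB blocks `sb_location()` yields block 1, offset 0; with 4 KiB blocks it yields block 0, offset 1024, which is why a fill_super routine has to add an offset after reading the block.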

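1.4 Attaching to the mount namespace (vfsmount)

Besides the super_block, the kernel needs a struct mount (which embeds a struct vfsmount) to describe where the filesystem is attached.

// fs/namespace.c (simplified)
// Allocate the mount; the argument is the device name, not the mount point
struct mount *mnt = alloc_vfsmnt("/dev/sda1");

// Fill in the mount
mnt->mnt_mountpoint = dentry_mnt;  // dentry of /mnt in the parent filesystem
mnt->mnt.mnt_root = s->s_root;     // root dentry of the new filesystem
mnt->mnt.mnt_sb = s;               // its super_block

Structure after mounting:

Parent filesystem (e.g. /)
    └─ dentry_mnt (the "/mnt" directory)
         └─ mount point
              └─ vfsmount (the new mount)
                   ├─ mnt_root → dentry("/")
                   └─ mnt_sb → super_block(ext4)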

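2. Using the VFS

Once the VFS is set up, user space reaches it through system calls. Take open("/mnt/file.txt") as the example:

2.1 System call entry

// User space
int fd = open("/mnt/file.txt", O_RDONLY);

// Kernel space (fs/open.c)
SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, umode_t, mode)
{
    return do_sys_open(AT_FDCWD, filename, flags, mode);
}

// fs/open.c (simplified)
long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode)
{
    struct open_flags op;
    struct filename *tmp;
    struct file *f;
    int fd = build_open_flags(flags, mode, &op);

    if (fd)
        return fd;

    // 1. Copy the path from user space into the kernel
    tmp = getname(filename);
    if (IS_ERR(tmp))
        return PTR_ERR(tmp);

    // 2. Reserve a file descriptor
    fd = get_unused_fd_flags(flags);
    if (fd >= 0) {
        // 3. Resolve the path and open the file
        f = do_filp_open(dfd, tmp, &op);
        if (IS_ERR(f)) {
            put_unused_fd(fd);
            fd = PTR_ERR(f);
        } else {
            // 4. Publish the file in the fd table
            fd_install(fd, f);
        }
    }
    putname(tmp);
    return fd;
}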

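2.2 Path resolution (the core step)

Path resolution is the most intricate part of the VFS: it walks the path one component at a time, finding the dentry and inode for each.

// fs/namei.c (simplified)
struct file *do_filp_open(int dfd, struct filename *pathname,
                          const struct open_flags *op)
{
    struct nameidata nd;   // state of the path walk
    struct file *filp;

    // 1. Record the starting directory and the path to resolve
    set_nameidata(&nd, dfd, pathname);

    // 2-4. path_openat() does the rest:
    //   - path_init() picks the starting point (root, cwd or dfd)
    //   - link_path_walk() resolves "/mnt/file.txt" component by
    //     component: "/" → "mnt" → "file.txt"
    //   - the final dentry supplies the inode, and vfs_open() →
    //     do_dentry_open() fills in the struct file
    filp = path_openat(&nd, op, op->lookup_flags | LOOKUP_RCU);
    if (filp == ERR_PTR(-ECHILD))  // lock-free RCU walk gave up: retry with refcounts
        filp = path_openat(&nd, op, op->lookup_flags);

    restore_nameidata();
    return filp;
}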

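2.2.1 Path resolution in detail

// fs/namei.c (pseudocode modeled on link_path_walk(); mount-point
// crossing, symlinks and permission checks are omitted)
static int link_path_walk(const char *name, struct nameidata *nd)
{
    // 1. The starting point (root, cwd or dfd) is already in
    //    nd->path, put there by path_init()
    while (*name == '/')
        name++;

    // 2. Loop over the path components
    while (*name) {
        // Hash the next component (e.g. "mnt") and record it in nd->last
        u64 hash_len = hash_name(nd->path.dentry, name);

        nd->last.hash_len = hash_len;
        nd->last.name = name;
        name += hashlen_len(hash_len);

        // Skip the separating slashes
        while (*name == '/')
            name++;
        if (!*name)
            return 0;   // final component: left in nd->last for the caller

        // 3. Look the component up (dcache first, then the filesystem)
        // 4. and advance nd->path to the result
        int err = walk_component(nd, WALK_MORE);
        if (err)
            return err;
    }
    return 0;
}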
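
The component-splitting part of the walk is easy to reproduce in user space. The following is a sketch under stated assumptions: `struct component`, `next_component` and `count_components` are hypothetical names, and no hashing, symlink handling or "." and ".." processing is attempted. It shows only how a path such as "/mnt/file.txt" breaks into "mnt" and "file.txt", with repeated slashes collapsed.

```c
#include <stddef.h>
#include <string.h>

/* One path component; `name` points into the original string
 * and is NOT NUL-terminated, so always use `len`. */
struct component {
    const char *name;
    size_t      len;
};

/* Skip leading slashes, describe the next component in *comp and
 * return a pointer to the rest of the path. comp->len == 0 means
 * the path is exhausted. */
static const char *next_component(const char *path, struct component *comp)
{
    while (*path == '/')
        path++;
    comp->name = path;
    comp->len  = strcspn(path, "/");   /* length up to the next '/' or NUL */
    return path + comp->len;
}

/* Count the components of a path. */
static size_t count_components(const char *path)
{
    struct component c;
    size_t n = 0;

    for (path = next_component(path, &c); c.len;
         path = next_component(path, &c))
        n++;
    return n;
}
```

Keeping a pointer and a length instead of copying each name mirrors how the kernel's `struct qstr` refers to a slice of the path being walked.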

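2.2.2 dentry lookup (dcache first)

// fs/namei.c (pseudocode combining lookup_fast() and lookup_slow();
// lookup_component() is a made-up name for that combination)
static struct dentry *lookup_component(struct dentry *parent,
                                       const struct qstr *name,
                                       unsigned int flags)
{
    // 1. name->hash was already computed during the path walk
    // 2. Search the dentry hash table
    struct dentry *dentry = d_lookup(parent, name);

    if (dentry)
        return dentry;  // cache hit

    // 3. Cache miss: ask the filesystem to search the directory
    return lookup_slow(name, parent, flags);
}

// lookup_slow() queries the on-disk directory (simplified; locking omitted)
static struct dentry *lookup_slow(const struct qstr *name,
                                  struct dentry *parent,
                                  unsigned int flags)
{
    struct inode *dir = parent->d_inode;
    struct dentry *dentry, *old;

    // 1. Allocate a fresh dentry for the name
    dentry = d_alloc(parent, name);
    if (!dentry)
        return ERR_PTR(-ENOMEM);

    // 2. Call the directory's lookup operation (filesystem-specific).
    //    The filesystem attaches the inode and adds the dentry to the
    //    dcache itself, typically via d_splice_alias() or d_add().
    old = dir->i_op->lookup(dir, dentry, flags);
    if (old) {          // an existing alias or an ERR_PTR
        dput(dentry);
        dentry = old;
    }
    return dentry;
}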
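
What makes the dcache lookup work is that entries are keyed on the pair (parent dentry, name), so "file.txt" under /mnt and "file.txt" under / are distinct entries. The user-space sketch below illustrates that keying with a chained hash table. Everything in it is an assumption made for illustration: the `mini_dentry` type, the table size, and the FNV-1a hash, which is not the hash function the kernel uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NBUCKETS 64u   /* arbitrary size chosen for the sketch */

struct mini_dentry {
    struct mini_dentry *parent;
    const char         *name;
    struct mini_dentry *hash_next;  /* next entry in the same bucket */
};

static struct mini_dentry *buckets[NBUCKETS];

/* FNV-1a over the name, mixed with the parent pointer. */
static unsigned bucket_of(const struct mini_dentry *parent, const char *name)
{
    uint32_t h = 2166136261u;

    for (; *name; name++) {
        h ^= (unsigned char)*name;
        h *= 16777619u;
    }
    h ^= (uint32_t)((uintptr_t)parent >> 4);
    return h % NBUCKETS;
}

static void cache_add(struct mini_dentry *d)
{
    unsigned b = bucket_of(d->parent, d->name);

    d->hash_next = buckets[b];
    buckets[b] = d;
}

/* Return the cached entry for (parent, name), or NULL on a miss. */
static struct mini_dentry *cache_lookup(const struct mini_dentry *parent,
                                        const char *name)
{
    struct mini_dentry *d;

    for (d = buckets[bucket_of(parent, name)]; d; d = d->hash_next)
        if (d->parent == parent && strcmp(d->name, name) == 0)
            return d;
    return NULL;
}
```

A miss in `cache_lookup()` corresponds to the point where the real code has to fall back to the filesystem's lookup operation.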

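2.2.3 Directory lookup (ext4 as the example)

// fs/ext4/namei.c (simplified)
static struct dentry *ext4_lookup(struct inode *dir, struct dentry *dentry,
                                  unsigned int flags)
{
    struct ext4_dir_entry_2 *de;
    struct buffer_head *bh;
    struct inode *inode = NULL;

    // 1. Search the directory's on-disk entries for the name
    bh = ext4_find_entry(dir, &dentry->d_name, &de, NULL);
    if (IS_ERR(bh))
        return ERR_CAST(bh);

    if (bh) {
        // 2. Extract the inode number
        __u32 ino = le32_to_cpu(de->inode);

        brelse(bh);

        // 3. Load the inode (from the inode cache or the on-disk inode table)
        inode = ext4_iget(dir->i_sb, ino);
        if (IS_ERR(inode))
            return ERR_CAST(inode);
    }

    // 4. Bind dentry and inode and hash the dentry. A missing name is
    //    not an error here: inode stays NULL, the dentry becomes a
    //    negative dentry, and the caller reports ENOENT.
    return d_splice_alias(inode, dentry);
}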
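
To show what "searching the on-disk entries" amounts to in the simplest case, here is a user-space sketch that scans one linear directory block. It assumes the classic `ext4_dir_entry_2` layout (32-bit inode, 16-bit `rec_len`, 8-bit `name_len`, 8-bit `file_type`, then the name, all little-endian) and ignores hashed (htree) directories, checksums, casefolding and the special `rec_len` encoding used with very large blocks. The function names are hypothetical; `put_dirent` exists only to build sample data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIRENT_HEADER 8u  /* inode (4) + rec_len (2) + name_len (1) + file_type (1) */

static uint32_t le32_at(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint16_t le16_at(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Scan one directory block for `name`. Returns the inode number,
 * or 0 if the name is absent or the block looks corrupt. */
static uint32_t find_in_dir_block(const unsigned char *blk, size_t blksize,
                                  const char *name)
{
    size_t want = strlen(name);
    size_t pos = 0;

    while (pos + DIRENT_HEADER <= blksize) {
        uint32_t ino     = le32_at(blk + pos);
        uint16_t rec_len = le16_at(blk + pos + 4);
        uint8_t  nlen    = blk[pos + 6];

        /* A record must cover its header and name and stay inside the block. */
        if (rec_len < DIRENT_HEADER + nlen || rec_len > blksize - pos)
            return 0;
        /* inode 0 marks an unused (deleted) entry. */
        if (ino != 0 && nlen == want &&
            memcmp(blk + pos + DIRENT_HEADER, name, want) == 0)
            return ino;
        pos += rec_len;   /* rec_len chains to the next entry */
    }
    return 0;
}

/* Sample-data helper: write one entry at `pos`, return the next position. */
static size_t put_dirent(unsigned char *blk, size_t pos, uint32_t ino,
                         uint16_t rec_len, const char *name)
{
    size_t nlen = strlen(name);

    blk[pos + 0] = (unsigned char)(ino & 0xff);
    blk[pos + 1] = (unsigned char)((ino >> 8) & 0xff);
    blk[pos + 2] = (unsigned char)((ino >> 16) & 0xff);
    blk[pos + 3] = (unsigned char)((ino >> 24) & 0xff);
    blk[pos + 4] = (unsigned char)(rec_len & 0xff);
    blk[pos + 5] = (unsigned char)(rec_len >> 8);
    blk[pos + 6] = (unsigned char)nlen;
    blk[pos + 7] = 0;                 /* file_type: unknown */
    memcpy(blk + pos + DIRENT_HEADER, name, nlen);
    return pos + rec_len;
}
```

Because entries are chained through `rec_len` rather than stored at fixed strides, the last entry in a block normally stretches to the end of the block, and a deleted entry can simply be absorbed into the `rec_len` of its predecessor.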

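2.3 Creating the file object

Once the path is resolved, the struct file is set up:

// fs/open.c (simplified). path_openat() has already allocated f with
// alloc_empty_file() (get_empty_filp() in older kernels), which also
// sets f->f_flags, and vfs_open() has stored the resolved path in
// f->f_path before calling this function.
static int do_dentry_open(struct file *f, struct inode *inode,
                          int (*open)(struct inode *, struct file *))
{
    int error = 0;

    // 1. Pin the mount and dentry the file refers to
    path_get(&f->f_path);

    // 2. Initialize
    f->f_inode = inode;
    f->f_mapping = inode->i_mapping;
    f->f_op = fops_get(inode->i_fop);  // take the operations table from the inode
    // f->f_pos is already 0: the struct file was allocated zeroed

    // 3. Call the filesystem's open callback, if it has one
    if (!open)
        open = f->f_op->open;
    if (open)
        error = open(inode, f);

    return error;
}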

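2.4 Binding to the process

// fs/file.c (simplified; RCU locking omitted)
void fd_install(unsigned int fd, struct file *file)
{
    struct files_struct *files = current->files;
    struct fdtable *fdt = files_fdtable(files);

    // Store the file pointer in the slot reserved by get_unused_fd_flags()
    rcu_assign_pointer(fdt->fd[fd], file);
}

The process's file descriptor table now looks like this:

task_struct->files->fdt
    └─ fd[3] → struct file
                    ├─ f_path.dentry → dentry("file.txt")
                    ├─ f_path.mnt → vfsmount(/mnt)
                    ├─ f_inode → inode(file.txt)
                    └─ f_op → ext4_file_operations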
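
The two-step pattern of reserving a descriptor first and publishing the file later can be sketched with a fixed-size table. This is an illustrative model only: `fd_table`, `fd_reserve`, `fd_publish`, `fd_lookup` and `fd_close` are hypothetical names, the table does not grow, and there is no locking. It does reproduce the POSIX rule that the lowest free descriptor is handed out, which is why the first file a typical process opens lands on fd 3.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_FDS 16   /* tiny fixed table; the kernel grows its table on demand */

struct mini_file { int flags; };          /* placeholder for struct file */

struct fd_table {
    struct mini_file *fd[MAX_FDS];        /* published files, NULL if none */
    unsigned char     reserved[MAX_FDS];  /* slot handed out, maybe not published yet */
};

/* Reserve the lowest free descriptor, or return -EMFILE if the table is full. */
static int fd_reserve(struct fd_table *t)
{
    for (int fd = 0; fd < MAX_FDS; fd++) {
        if (!t->reserved[fd]) {
            t->reserved[fd] = 1;
            return fd;
        }
    }
    return -EMFILE;
}

/* Make the file visible under a previously reserved descriptor. */
static void fd_publish(struct fd_table *t, int fd, struct mini_file *f)
{
    t->fd[fd] = f;
}

/* Translate a descriptor into its file, or NULL if nothing is published. */
static struct mini_file *fd_lookup(const struct fd_table *t, int fd)
{
    if (fd < 0 || fd >= MAX_FDS)
        return NULL;
    return t->fd[fd];
}

/* Release a published descriptor so it can be reused. */
static int fd_close(struct fd_table *t, int fd)
{
    if (fd < 0 || fd >= MAX_FDS || !t->fd[fd])
        return -EBADF;
    t->fd[fd] = NULL;
    t->reserved[fd] = 0;
    return 0;
}
```

Separating reservation from publication means a failed open can give the number back without any other thread ever having seen a half-initialized file behind it.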

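2.5 Subsequent I/O

// read(fd, buf, size) (simplified; the real code uses fdget_pos()
// and works on a local copy of the file position)
SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
{
    struct file *file = fget(fd);  // look up the file by fd
    ssize_t ret;

    if (!file)
        return -EBADF;

    // Dispatch to the file's read implementation
    ret = vfs_read(file, buf, count, &file->f_pos);

    fput(file);
    return ret;
}

// vfs_read() dispatches to the filesystem's implementation (simplified)
ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
{
    // Some file types provide ->read directly
    if (file->f_op->read)
        return file->f_op->read(file, buf, count, pos);
    // ext4 provides ->read_iter (ext4_file_read_iter) instead
    if (file->f_op->read_iter)
        return new_sync_read(file, buf, count, pos);
    return -EINVAL;
}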
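
The dispatch through `f_op` is ordinary function-pointer polymorphism, and a small user-space model makes that visible. In the sketch below all names (`mini_file`, `mini_file_ops`, `generic_read`, and the two "filesystems") are invented for illustration. The generic layer knows nothing about where bytes come from; it only checks that an implementation exists and calls it.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

struct mini_file;

/* Operations table, in the spirit of struct file_operations. */
struct mini_file_ops {
    ssize_t (*read)(struct mini_file *f, char *buf, size_t count);
};

struct mini_file {
    const struct mini_file_ops *ops;
    const char *data;   /* backing bytes (used by mem_read only) */
    size_t      size;
    size_t      pos;    /* current read position */
};

/* Implementation A: serve bytes from a memory buffer. */
static ssize_t mem_read(struct mini_file *f, char *buf, size_t count)
{
    size_t left = f->size - f->pos;

    if (count > left)
        count = left;   /* short read at end of data */
    memcpy(buf, f->data + f->pos, count);
    f->pos += count;
    return (ssize_t)count;
}

/* Implementation B: an endless stream of zero bytes. */
static ssize_t zero_read(struct mini_file *f, char *buf, size_t count)
{
    (void)f;
    memset(buf, 0, count);
    return (ssize_t)count;
}

static const struct mini_file_ops mem_ops  = { .read = mem_read };
static const struct mini_file_ops zero_ops = { .read = zero_read };

/* Generic layer: dispatch through the table, fail if nothing is provided. */
static ssize_t generic_read(struct mini_file *f, char *buf, size_t count)
{
    if (!f->ops || !f->ops->read)
        return -EINVAL;
    return f->ops->read(f, buf, count);
}
```

Adding a third kind of file means supplying another operations table; `generic_read()` and its callers stay unchanged, which is the property that lets one read() system call serve every filesystem.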

3. End-to-end sequence diagram

┌─────────────┐  ┌──────────────┐  ┌─────────────────────────────────────┐
│ User space  │  │  VFS layer   │  │      Concrete filesystem (ext4)     │
└─────────────┘  └──────────────┘  └─────────────────────────────────────┘
       │                │                          │
       │ mount()        │                          │
       │───────────────>│                          │
       │                │ get_fs_type("ext4")      │
       │                │ (searches file_systems)  │
       │                │                          │
       │                │ type->mount()            │
       │                │─────────────────────────>│
       │                │                          │ create super_block
       │                │                          │ read on-disk superblock
       │                │                          │ create root inode
       │                │<─────────────────────────│ return root dentry
       │                │                          │
       │                │ create vfsmount          │
       │                │                          │
       │<───────────────│                          │
       │                │                          │
       │ open()         │                          │
       │───────────────>│                          │
       │                │ path_openat()            │
       │                │ dentry lookup (dcache)   │
       │                │─────────────────────────>│ on a miss: search directory
       │                │<─────────────────────────│ return inode
       │                │                          │
       │                │ create struct file       │
       │                │                          │
       │<───────────────│                          │
       │                │                          │
       │ read()         │                          │
       │───────────────>│                          │
       │                │ file->f_op->read_iter()  │
       │                │─────────────────────────>│ read data from disk
       │                │<─────────────────────────│ return data
       │<───────────────│                          │
       │                │                          │

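4. Tools for inspecting VFS state

4.1 Viewing mount information

# List all mounts
mount

# or
cat /proc/mounts

4.2 Viewing the dentry and inode caches

# dcache statistics
cat /proc/sys/fs/dentry-state
# Sample output: 6432  4800  45  0  0  0
# Fields: nr_dentry, nr_unused, age_limit, want_pages, then nr_negative (newer kernels) and an unused field

# inode statistics
cat /proc/sys/fs/inode-nr
# Sample output: 1234 567 (nr_inodes, nr_free_inodes)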
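
For monitoring from a program rather than a shell, the dentry-state line can be parsed with a few lines of C. This sketch reads only the first four fields, since the meaning of the remaining two has changed between kernel versions. `struct dentry_state` and both function names are defined here for the example and are not part of any library.

```c
#include <stdio.h>

/* First four fields of /proc/sys/fs/dentry-state. */
struct dentry_state {
    long nr_dentry;   /* dentries currently allocated */
    long nr_unused;   /* dentries not in active use (reclaimable) */
    long age_limit;   /* age in seconds considered for reclaim */
    long want_pages;  /* pages requested by the system */
};

/* Parse one dentry-state line. Returns 0 on success, -1 if malformed. */
static int parse_dentry_state(const char *line, struct dentry_state *st)
{
    int n = sscanf(line, "%ld %ld %ld %ld",
                   &st->nr_dentry, &st->nr_unused,
                   &st->age_limit, &st->want_pages);

    return n == 4 ? 0 : -1;
}

/* Read the live values on Linux. Returns -1 if the file cannot be
 * opened or parsed (for example on a system without /proc). */
static int read_dentry_state(struct dentry_state *st)
{
    char line[128];
    FILE *fp = fopen("/proc/sys/fs/dentry-state", "r");
    int rc = -1;

    if (!fp)
        return -1;
    if (fgets(line, sizeof line, fp))
        rc = parse_dentry_state(line, st);
    fclose(fp);
    return rc;
}
```

Sampling `nr_unused` against `nr_dentry` over time gives a rough picture of how much of the dcache could be reclaimed under memory pressure.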

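4.3 Viewing the slab caches

# /proc/slabinfo is normally readable by root only
sudo grep -E '^(inode_cache|dentry|filp|ext4_inode_cache)' /proc/slabinfo

4.4 Viewing a process's fd table

ls -l /proc/<pid>/fd/

4.5 Tracing system calls

# Trace the file-related system calls of a running process with strace
strace -e trace=open,openat,read,write -p <pid>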

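5. Key takeaways

5.1 Characteristics of the setup phase

- Layered: kernel initialization → filesystem registration → mount → object creation
- On demand: objects are not all created up front; they appear when first needed
- Cached for reuse: a super_block lives for the whole mount, while inodes and dentries stay cached after use and are reclaimed under memory pressure

5.2 Characteristics of the usage phase

- Path resolution is the core: most of the complexity of open() sits in the path walk (path_openat() and link_path_walk())
- The dcache speeds up lookups: the first access goes to disk, later accesses are served from the cache
- Function tables provide polymorphism: the VFS calls the concrete implementation through f_op, i_op and s_op

5.3 Performance levers

- dcache hit rate: determines how fast paths resolve
- inode cache: avoids re-reading inodes from disk
- Readahead: sequential reads cause the following data blocks to be prefetched into the page cache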

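6. Where VFS objects live: a summary

| Object | Lives in | Allocated with | On-disk counterpart | Caching | Lifetime |
|---|---|---|---|---|---|
| super_block | kernel memory | `kzalloc` | yes (the superblock) | global list | mount to umount |
| inode | kernel memory | `kmem_cache_alloc` | yes (the on-disk inode) | inode cache | first access until reclaimed |
| dentry | kernel memory | `kmem_cache_alloc` | no | dcache (hash table + LRU) | path resolution until reclaimed |
| file | kernel memory | `kmem_cache_alloc` | no | none | open() to close() |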

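Closing remarks

Through layered design and caching, the Linux VFS provides a uniform and efficient filesystem abstraction, and that abstraction underpins Linux I/O performance. Understanding how the VFS is set up and used helps when studying the kernel, writing filesystem drivers, tuning I/O performance, and making sense of container technology.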