Linux进程池与管道通信详解：从原理到实现

1. 引言

在Linux系统编程中，进程池是一种重要的并发编程模型，它通过预先创建多个子进程来处理任务，避免了频繁创建和销毁进程的开销。结合管道进行进程间通信，可以构建高效的任务处理系统。本文将详细分析一个完整的进程池实现，涵盖各个模块的设计原理和易错点。

2. 核心代码结构

2.1 任务函数定义

cpp 复制代码

/////////////////子进程要完成的任务/////////////////////
void SyncDisk()
{
    std::cout << getpid() << "刷新数据到磁盘任务" << std::endl;
    sleep(1);
}

void DownLoad()
{
    std::cout << getpid() << "下载数据到系统中" << std::endl;
    sleep(1);
}

void PrintLog()
{
    std::cout << getpid() <<"打印日志任务" << std::endl;
    sleep(1);
}

void UpdateStatus()
{
    std::cout << getpid() <<"更新一次用户的状态" << std::endl;
    sleep(1);
}

typedef void (*task_t)(); //函数指针
task_t tasks[4] = {SyncDisk, DownLoad, PrintLog, UpdateStatus};

代码解释：

定义了4个具体的任务函数，每个函数模拟不同的工作场景
使用task_t类型别名定义函数指针，便于统一管理
tasks数组将任务函数组织起来，通过索引即可调用相应任务

2.2 子进程任务处理

cpp 复制代码

/////////////////进程池相关/////////////////////

// 子进程的入口函数
void DoTask(int fd)
{
    while(true)
    {
        int task_code = 0;
        ssize_t n = read(fd, &task_code, sizeof(task_code));
        if (n == sizeof(task_code))
        {
            if (task_code >= 0 && task_code < 4)
            {
                tasks[task_code](); //执行任务表中的任务
            }
        }
        else if(n == 0)
        {
            //父进程关闭了写端，子进程读到0表示管道读端关闭
            std::cout << getpid() << "task quit ..." << std::endl;
            break;
        }
        else
        {
            perror("read error");
            break;
        }
    }
}

关键点分析：

子进程通过阻塞读取管道来等待任务
read返回值处理：
- 读取到完整任务码：执行对应任务
- 返回0：父进程关闭管道，子进程退出
- 返回-1：读取错误，退出循环

3. 进程池核心设计

3.1 Channel类 - 进程通信通道

cpp 复制代码

class Channel
{
public:
    Channel(int wfd, pid_t pid): _wfd(wfd), _sub_pid(pid)
    {
        _sub_name = "sub-process-" + std::to_string(_sub_pid);
    }
    
    void Write(int index)
    {
        ssize_t n = write(_wfd, &index, sizeof(index));
        (void)n;
    }
    
    void ClosePipe()
    {
        std::cout << "关闭管道，通知子进程：" << _wfd << " 退出..." << std::endl;
        close(_wfd);
    }
    
    void Wait()
    {
        pid_t rid = waitpid(_sub_pid, nullptr, 0);
        (void)rid;
        std::cout << "回收子进程：" << _sub_pid << " 成功..." << std::endl;
    }
    
private:
    int _wfd;
    pid_t _sub_pid; //子进程是谁
    std::string _sub_name;// 子进程名字
};

设计要点：

封装了管道写端文件描述符和子进程PID
提供统一的写操作、关闭管道和等待子进程接口
每个Channel对象代表一个父进程与子进程的通信通道

3.2 ProcessPool类 - 进程池管理

cpp 复制代码

class ProcessPool
{
private:
    std::vector<Channel> channels;
    
public:
    void Init(cb_t cb)
    {
        CreateProcessChannel(cb);
    }
    
    void Run()
    {
        int cnt = 10;
        while(cnt--)
        {
            // 任务调度逻辑
            int index = SelectChannel();
            int itask = SelectTask();
            SendTask2Slaver(itask, index);
        }
    }
    
    void Quit()
    {
        for(auto &channel: channels)
        {
            channel.ClosePipe();
            channel.Wait();
        }
    }
};

4. 关键实现细节

4.1 进程创建与管道建立

cpp 复制代码

void CreateProcessChannel(cb_t cb)
{
    for(int i = 0; i < gprocessnum; i++)
    {
        int pipefd[2] = {0};
        int n = pipe(pipefd);
        if (n < 0)
        {
            std::cerr << "pip create error" << std::endl;
            exit(PIPE_ERROR);
        }
        
        pid_t id = fork();
        if (id < 0)
        {
            std::cerr << "fork error" << std::endl;
            exit(FORK_ERROR);
        }
        else if (id == 0)
        {
            // 子进程代码
            if(!channels.empty())
            {
                for(auto &channel: channels)
                {
                    channel.ClosePipe();
                }
            }
            close(pipefd[1]);
            cb(pipefd[0]);
            exit(OK);
        }
        else
        {
            // 父进程代码
            close(pipefd[0]);
            channels.emplace_back(pipefd[1],id);
            std::cout << "创建子进程：" << id << "成功..." << std::endl;
            sleep(1);
        }
    }
}

重要细节：

管道创建：每个子进程对应一个独立的管道
文件描述符继承：子进程会继承父进程之前创建的所有管道写端
资源清理：子进程需要关闭不需要的文件描述符

4.2 任务调度策略

cpp 复制代码

int SelectChannel()
{
    static int index = 0;
    int selected = index;
    index++;
    index %= channels.size();
    return selected;
}

int SelectTask()
{
    int itask = rand() % 4;
    return itask;
}

调度策略：

轮询方式选择子进程
随机选择任务类型
实际应用中可根据负载情况实现更复杂的调度算法

5. 易错点与解决方案

5.1 文件描述符泄漏问题

问题描述：

子进程继承了父进程创建的所有管道写端，导致文件描述符泄漏。

从右侧的示意图可以清晰地看到进程池的创建过程与文件描述符的继承关系：

创建第一个子进程时：
- 父进程调用 pipe 创建了一对管道，假设获得文件描述符 fd[0]=3（读端）和 fd[1]=4（写端）。
- 父进程通过 fork 创建子进程1。子进程会继承父进程当前所有的文件描述符，因此它也拥有这对管道（子进程的 fd[0]=3, fd[1]=4）。
- 资源分配 ：父进程关闭它不需要的读端(3)，保留写端(4)用于向子进程1发送任务。子进程1关闭它不需要的写端(4)，保留读端(3)用于接收任务。至此，父进程的写端 4 与子进程1的读端 3 构成了第一个通信通道。
创建第二个子进程时：
- 父进程再次调用 pipe，创建第二对管道，假设获得新的文件描述符 fd[0]=5（读端）和 fd[1]=6（写端）。
- 父进程再次 fork 创建子进程2。关键点在于 ：由于子进程是通过复制父进程的地址空间创建的，它会继承父进程在创建它那一刻 的所有文件描述符。这意味着子进程2不仅拥有新创建的管道(5, 6)，还继承了第一对管道的文件描述符(3, 4)。
- 资源分配 ：父进程关闭第二对管道的读端(5)，保留新写端(6)。子进程2则需要清理：它关闭从父进程继承来的、属于第一个通道的写端(4)，并关闭第二对管道中它不需要的写端(6)，最终只保留第二对管道的读端(5)用于接收任务。这样，父进程的写端 6 与子进程2的读端 5 构成了第二个通信通道。

后续子进程的创建过程以此类推。每个子进程在创建时都会继承父进程之前打开的所有管道。所以如果直接写下边代码，问题在于父进程虽然关闭写端，但对于管道来说，还有子进程对其有写权限，因此子进程不会被释放。

cpp 复制代码

// version2--bug
for(auto &channel: channels)
{
    channel.ClosePipe();
    channel.Wait();
}

解决方案：

在创建子进程后关闭子进程的立式管道的写端

cpp 复制代码

// 子进程中关闭历史管道的写端
if(!channels.empty())
{
    for(auto &channel: channels)
    {
        channel.ClosePipe();
    }
}

有了上述操作后下边的操作才正确

cpp 复制代码

// version2
for(auto &channel: channels)
{
    channel.ClosePipe();
    channel.Wait();
}

其他解决方案：

cpp 复制代码

// 方案1：先关闭所有管道，再统一等待
for(auto &channel: channels)
{
    channel.ClosePipe();
}
for(auto &channel: channels)
{
    channel.Wait();
}

// 方案2：倒序关闭和等待
int end = channels.size() - 1;
while (end >= 0)
{
    channels[end].ClosePipe();
    channels[end].Wait();
    end--;
}

5.3 管道读写同步

关键点：

读写数据大小必须一致（4字节整型）
需要处理管道关闭和读取错误的情况
确保任务码在有效范围内

6. 完整测试流程

cpp 复制代码

int main()
{
    // 1 初始化进程池
    ProcessPool pp;
    pp.Init(DoTask);
    pp.Debug();

    //2 父进程控制子进程
    pp.Run();
    
    //3 释放和回收所有资源
    pp.Quit();
    
    return 0;
}

7. 总结

本文详细分析了一个基于管道通信的Linux进程池实现，涵盖了：

任务管理：通过函数指针数组统一管理任务
进程通信：使用管道进行父子进程间通信
资源管理：正确处理文件描述符的打开和关闭
进程同步：合理的进程创建和退出机制

8. 完整代码

cpp 复制代码

#include<iostream>
#include<string>
#include<unistd.h>
#include<functional>
#include<vector>
#include<stdlib.h>
#include<ctime>
#include<sys/wait.h>

/////////////////子进程要完成的任务/////////////////////
void SyncDisk()
{
    std::cout << getpid() << "刷新数据到磁盘任务" << std::endl;
    sleep(1);
}

void DownLoad()
{
    std::cout << getpid() << "下载数据到系统中" << std::endl;
    sleep(1);
}

void PrintLog()
{
    std::cout << getpid() <<"打印日志任务" << std::endl;
    sleep(1);
}

void UpdateStatus()
{
    std::cout << getpid() <<"更新一次用户的状态" << std::endl;
    sleep(1);
}

typedef void (*task_t)(); //函数指针

task_t tasks[4] = {SyncDisk, DownLoad, PrintLog, UpdateStatus};

/////////////////进程池相关/////////////////////

// 子进程的入口函数
void DoTask(int fd)
{
    while(true)
    {
        int task_code = 0;
        ssize_t n = read(fd, &task_code, sizeof(task_code));
        if (n == sizeof(task_code))
        {
            if (task_code >= 0 && task_code < 4)
            {
                tasks[task_code](); //执行任务表中的任务
            }
        }
        else if(n == 0)
        {
            //父进程关闭了写端，子进程读到0表示管道读端关闭
            std::cout << getpid() << "task quit ..." << std::endl;
            break;
        }
        else
        {
            perror("read error");
            break;
        }
    }
}

const int gprocessnum = 5; //全局进程个数

using cb_t =  std::function<void (int)>;

enum
{
    OK = 0,
    PIPE_ERROR,
    FORK_ERROR,
};

class ProcessPool
{
private:
    class Channel
    {
        public:
            Channel(int wfd, pid_t pid): _wfd(wfd), _sub_pid(pid)
            {
                _sub_name = "sub-process-" + std::to_string(_sub_pid);
            }
            void PrintInfo()
            {
                printf("wfd: %d, who: %d,channel name:%s\n",_wfd,_sub_pid, _sub_name.c_str());
            }
            void Write(int index)
            {
                ssize_t n = write(_wfd, &index, sizeof(index)); // 约定的4字节发送
                (void)n;
            }
            std::string Name()
            {
                return _sub_name;
            }
            void ClosePipe()
            {
                std::cout << "关闭管道，通知子进程：" << _wfd << " 退出..." << std::endl;
                close(_wfd);
            }
            void Wait()
            {
                pid_t rid = waitpid(_sub_pid, nullptr, 0);
                (void)rid;
                std::cout << "回收子进程：" << _sub_pid << " 成功..." << std::endl;
            }
            ~Channel()
            {}
        private:
            int _wfd;
            pid_t _sub_pid; //子进程是谁
            std::string _sub_name;// 子进程名字
            // int cnt; //当前子进程处理的任务数
    };

public:
    ProcessPool()
    {
        srand((unsigned int)time(NULL) ^ getpid());
    }
    ~ProcessPool(){}

    void Init(cb_t cb)
    {
        CreateProcessChannel(cb);
    }

    void Debug()
    {
        for(auto &c: channels)
        {
            c.PrintInfo();
        }
    }

    void Quit()
    {
    // version1
        // //1 让所有子进程退出
        // for(auto &channel: channels)
        // {
        //     channel.ClosePipe();
        // }

        // //2 回收子进程
        // for(auto &channel: channels)
        // {
        //     channel.Wait();
        // }

    // version2--bug版
        // 不能这样写
        // 因为依次创建子进程时，子进程2会继承父进程对先前写的管道功能（因为是fork出来的），父进程关了写端，但子进程2还持有对该管道的写端
        // for(auto &channel: channels)
        // {
        //     channel.ClosePipe();
        //     channel.Wait();
        // }
        // 发现规律：父进程写端4,5,6... ；子进程读端3，写端依次为3；4；4,5；4,5,6；...

    // 解决方案：
    //1 倒着关子进程：最后一个子进程，wfd只有1个
        // int end = channels.size() - 1;
        // while (end >= 0)
        // {
        //     channels[end].ClosePipe();
        //     channels[end].Wait();
        //     end--;
        // }

    //2 我们想要真正的1:1=r:w，就用之前的代码
        for(auto &channel: channels)
        {
            channel.ClosePipe();
            channel.Wait();
        }

    }

    void Run()
    {
        int cnt = 10;
        while(cnt--)
        {
            std::cout << "------------------父进程调度任务------------------" << std::endl;
            //1 选择一个channel(管道+子进程），本质是选择一个下标数字
            int index = SelectChannel();
            std::cout << "who indx:" << index << std::endl;
            // sleep(1);

            //2 选择一个任务
            int itask = SelectTask();
            std::cout << "itask:" << itask << std::endl;
            // sleep(1);

            //3 发送一个任务给指定的channel（管道+子进程）
            printf("发送任务 %d 给 %s \n", itask, channels[index].Name().c_str());
            SendTask2Slaver(itask, index);
            // sleep(1);

        }
        
    }

private:
    void SendTask2Slaver(int task, int index)
    {
        if(index < 0 || index >= channels.size())
            return;
        if (task < 4 && task >= 0)
        {
            channels[index].Write(task);
        }
        
    }

    int SelectChannel()
    {
        static int index = 0;
        int selected = index;
        index++;
        index %= channels.size();
        return selected;
    }

    int SelectTask()
    {
        int itask = rand() % 4;
        return itask;
    }

    void CreateProcessChannel(cb_t cb)
    {
        for(int i = 0; i < gprocessnum; i++) //只有父进程会执行循环
        {
            int pipefd[2] = {0};
            int n = pipe(pipefd);
            if (n < 0)
            {
                std::cerr << "pip create error" << std::endl;
                exit(PIPE_ERROR);
            }
            pid_t id = fork();
            if (id < 0)
            {
                std::cerr << "fork error" << std::endl;
                exit(FORK_ERROR);
            }
            else if (id == 0)
            {
                // 关闭历史wfd，影响的是自己的fd表
                if(!channels.empty())
                {
                    for(auto &channel: channels)
                    {
                        channel.ClosePipe();
                    }
                }

                close(pipefd[1]);
                cb(pipefd[0]);
                exit(OK); //根本不会执行后续代码，执行完自己的DoTask()函数后，自己就退出了
            }
            else
            {
                close(pipefd[0]);
                channels.emplace_back(pipefd[1],id);
                // Channel ch(pipefd[1],id);
                // channels.push_back(ch);
                std::cout << "创建子进程：" << id << "成功..." << std::endl;
                sleep(1);
            }
            
        }
    }

private:
    std::vector<Channel> channels;

};


int main()
{
    // 1 初始化进程池
    ProcessPool pp;
    pp.Init(DoTask);
    pp.Debug();

    //2 父进程控制子进程
    pp.Run();
    
    //3 释放和回收所有资源（释放管道回收子进程）
    pp.Quit();
    sleep(1000);
    
    return 0;
}