Linux: Process Replacement and Tying the Concepts Together

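Contents

Process program replacement
- How replacement works
- Understanding process replacement
Environment variables and process replacement
Command-line interpreter
- Implementation logic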

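Process program replacement

We have already learned how to create a child process, but a child created by fork still runs part of the parent's code. What if we want the child to run a completely new program?

In other words, what if it should execute entirely new code and access entirely new data, with no further ties to the parent's program? This is where process replacement comes in.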

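How replacement works

After fork creates a child process, the child runs the same program as the parent (though possibly a different code branch). The child often calls one of the exec functions to run another program. When a process calls an exec function, its user-space code and data are completely replaced by the new program, and execution starts from the new program's startup routine. Calling exec does not create a new process, so the process ID is the same before and after the call.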

Understanding process replacement

First, a demonstration of basic usage:

Usage in a single process


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int main()
{
    printf("execl begin:\n");
    execl("/usr/bin/ls", "ls", "-a", "-l", "-n", NULL);
    printf("execl end:\n");
    return 0;
}

Output:


[test@VM-16-11-centos 11-8]$ ./myprocess 
execl begin:
total 28
drwxrwxr-x  2 1003 1003 4096 Nov  9 10:50 .
drwxrwxrwt 16 1003 1003 4096 Nov  8 20:40 ..
-rw-rw-r--  1 1003 1003   74 Nov  8 20:41 Makefile
-rwxrwxr-x  1 1003 1003 8416 Nov  9 10:50 myprocess
-rw-rw-r--  1 1003 1003  175 Nov  9 10:47 myprocess.c

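As this shows, the basic idea is that the program running inside the process is replaced while the process itself stays the same.

Why is the final printf never executed?

Because when the exec function succeeds, the process's code and data are replaced by those of the target program, and execution continues in the new program. The code that followed the exec call no longer exists in the process, so it never runs. An exec function returns only if it fails.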

Program replacement with multiple processes

Change the code above to use a child process, as follows:


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()
{
    pid_t id = fork();
    if(id == 0)
    {
        // child
        printf("pid:%d,begin to exec!\n",getpid());
        sleep(3);
        execl("/usr/bin/ls","ls","-a","-l",NULL);
        printf("pid:%d,end to exec!\n",getpid());
    }
    else 
    {
        // father
        printf("wait child\n");
        pid_t rid = waitpid(-1,NULL,0);
        if(rid > 0)
        {
            printf("wait success\n");
        }
    
    }
    return 0;
}

The result is as follows:


[test@VM-16-11-centos 11-8]$ ./myprocess 
wait child
pid:18212,begin to exec!
total 28
drwxrwxr-x  2 test test 4096 Nov  9 11:05 .
drwxrwxrwt 16 test test 4096 Nov  8 20:40 ..
-rw-rw-r--  1 test test   74 Nov  8 20:41 Makefile
-rwxrwxr-x  1 test test 8672 Nov  9 11:05 myprocess
-rw-rw-r--  1 test test  695 Nov  9 11:05 myprocess.c
wait success

Compared with the single-process version, the multi-process version adds the parent waiting for and reaping the child.

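How should process replacement be understood in the multi-process case? Conceptually, parent and child share code and data right after fork. When the child calls exec, the child gets its own code and data, which are filled in from the new program, so the parent's code and data are unaffected.

One thing worth noticing: exec loads an executable program, not source code in a particular language. The target can therefore be almost anything that can run as a process, including programs written in C++, Java, and so on. See the experiment below.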

First, write a C++ program:


#include <iostream>

int main()
{
    std::cout<<"this is a cpp program"<<std::endl;
    return 0;
}

Then modify the earlier program:


#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()
{
    pid_t id = fork();
    if(id == 0)
    {
        // child
        printf("pid:%d,begin to exec!\n",getpid());
        sleep(3);
        execl("./cpptest","./cpptest",NULL);
        //execl("/usr/bin/ls","ls","-a","-l",NULL);
        printf("pid:%d,end to exec!\n",getpid());
    }
    else 
    {
        // father
        printf("wait child\n");
        pid_t rid = waitpid(-1,NULL,0);
        if(rid > 0)
        {
            printf("wait success\n");
        }
    
    }
    return 0;
}

Modify the Makefile:


.PHONY:all
all:myprocess cpptest

cpptest:cpptest.cc 
	g++ -o $@ $^

myprocess:myprocess.c
	gcc -o $@ $^
.PHONY:clean
clean:
	rm -rf myprocess cpptest

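This relies on make's dependency resolution: the phony target all depends on myprocess and cpptest, so make works out that both executables are needed and builds each one from its own source file.

Running it now gives the following result: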


[test@VM-16-11-centos 11-8]$ ./myprocess 
wait child
pid:23071,begin to exec!
this is a cpp program
wait success

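Clearly the replacement worked, and this time the child was replaced by a program of our own, written in a different language.

This also explains why different teams or companies can build modules separately, in whatever language suits them, and still combine them in the end at the process level.

In fact, the exec functions are thin wrappers over a system call (execve on Linux). From the operating system's point of view, every process in memory is the same kind of object regardless of the language its program was written in, and the system schedules it and allocates resources to it in the same way.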

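Environment variables and process replacement

During process replacement, how are environment variables passed along?

The conclusion: the child's environment variables are inherited directly from the parent.

Let's verify this.

Functions related to process replacement

execl takes the path of the executable file, so you need to know where the command lives. Usage: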


#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>


int main()
{
    pid_t id = fork();
    if(id == 0)
    {
        // child
        // replace the process image
        execl("/usr/bin/ls", "ls", "-a", "-l", "-d", NULL);
    }
    else 
    {
        // parent
        // reap the child
        pid_t rid = waitpid(-1, NULL, 0);
        if(rid > 0)
        {
            printf("wait success\n");
        }
    }
    return 0;
}

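execlp: searches the directories listed in the PATH environment variable for the command, so the file name alone is enough: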


execlp("ls", "ls", "-a", "-l", "-d", NULL);

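execle: runs another program, but with an environment that the caller supplies explicitly instead of the inherited one. We can confirm this by printing the environment variables in the new program.

How do we add an environment variable inside a process? With putenv:

int putenv(char *string);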

With that, we can write the following program:


#include <iostream>

int main(int argc, char* argv[], char* env[])
{
    // print the command-line arguments
    for(int i = 0; argv[i]; i++)
    {
        std::cout << i << "->" << argv[i] << std::endl;
    }
    std::cout << "##############" << std::endl;
    
    // print the environment variables
    for(int i = 0; env[i]; i++)
    {
        std::cout << i << "->" << env[i] << std::endl;
    }
    return 0;
}

The program above is the one that will be exec'd. On that basis, modify the original program:


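#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main()
{
    // add an environment variable in this program
    // putenv keeps the pointer, so the string must outlive the call;
    // no spaces around '=', otherwise they become part of the name and value
    static char myenv[] = "MYVAL1=11111111";
    putenv(myenv);
    pid_t id = fork();
    if(id == 0)
    {
        // child
        // replace the process image
        execl("./myprocess", "myprocess", NULL);
        // reached only if execl fails
        perror("execl");
        exit(1);
    }
    else 
    {
        // parent
        // reap the child
        pid_t rid = waitpid(-1, NULL, 0);
        if(rid > 0)
        {
            printf("wait success\n");
        }
    }
    return 0;
}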

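Running the program shows that the newly added variable does appear in the child's environment. This largely confirms that environment variables added in the parent are inherited by the child.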

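So who is the parent's parent? It is bash. Does that mean environment variables added in bash are also inherited by the child?

Modify the program above once more: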


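#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char* argv[], char* env[])
{
    // print this process's environment, inherited from its parent (bash)
    for(int i = 0; env[i]; i++)
    {
        printf("%d -> %s\n", i, env[i]);
    }
    // add environment variables in this program
    // putenv takes one "NAME=value" string per call and keeps the pointer
    static char myenv1[] = "MYVAL1=11111111";
    static char myenv2[] = "MYVAL2=22222222";
    putenv(myenv1);
    putenv(myenv2);
    pid_t id = fork();
    if(id == 0)
    {
        // child
        // replace the process image
        execl("./mytest", "mytest", NULL);
        // reached only if execl fails
        perror("execl");
        exit(1);
    }
    else 
    {
        // parent
        // reap the child
        pid_t rid = waitpid(-1, NULL, 0);
        if(rid > 0)
        {
            printf("wait success\n");
        }
    }
    return 0;
}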

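This gives a chain of inheritance: bash passes its environment to the parent process, and the parent passes its environment, including anything it added, to the child.

Next, look at execle.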

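How are environment variables passed?

The earlier examples show that the child's environment comes from the parent. execle passes the environment explicitly: after the NULL that terminates the argument list, it takes one more argument, envp[], which is the environment for the new program.

How is it used? See the program below: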


#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char* argv[], char* env[])
{
    // the environment handed to the new program: "NAME=value" strings
    // (no spaces around '='), terminated by NULL
    char* const myenv[] = {
        "MYVAL1=11111111",
        "MYVAL2=22222222",
        NULL
    };
    pid_t id = fork();
    if(id == 0)
    {
        // child
        // replace the process image, passing myenv as the new environment
        execle("./mytest", "mytest", NULL, myenv);
    }
    else 
    {
        // parent
        // reap the child
        pid_t rid = waitpid(-1, NULL, 0);
        if(rid > 0)
        {
            printf("wait success\n");
        }
    }
    return 0;
}

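Running it shows that this function passes environment variables to the child explicitly, and that it replaces rather than extends the environment: the new program sees only the variables in myenv.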

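That covers the basic logic of process replacement. What can it be used for in practice?

Command-line interpreter

As we understood it earlier, the command-line interpreter, bash, takes the commands the user types on the command line, runs them, and shows their output. The underlying logic is actually quite simple:

bash is essentially a program that runs in a loop waiting for input. When the user types a command line, bash splits it into an array of strings and creates a child process, and the child uses process replacement to run the requested command with its options while bash waits for it to finish. Based on this principle, we can implement a command-line interpreter ourselves.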

Implementation logic


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define NUM 1024
#define SIZE 64
#define SEP " "

char cwd[1024];
char enval[1024];
int lastcode = 0;

const char *getUsername()
{
    const char *name = getenv("USER");
    if(name) return name;
    else return "none";
}

const char *getHostname()
{
    const char *hostname = getenv("HOSTNAME");
    if(hostname) return hostname;
    else return "none";
}

const char *getCwd()
{
    const char *cwd = getenv("PWD");
    if(cwd) return cwd;
    else return "none";
}

int getUserCommand(char *command, int num)
{
    printf("[%s@%s %s]# ", getUsername(), getHostname(), getCwd());
    char *r = fgets(command, num, stdin);
    if(r == NULL) return -1; // EOF or read error
    // strip the trailing newline, if there is one
    command[strcspn(command, "\n")] = '\0';
    return strlen(command);
}

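void commandSplit(char *in, char *out[])
{
    // split in place on SEP; out[] ends with NULL and holds at most SIZE entries
    int argc = 0;
    char *token = strtok(in, SEP);
    while(token != NULL && argc < SIZE - 1)
    {
        out[argc++] = token;
        token = strtok(NULL, SEP);
    }
    out[argc] = NULL;
}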

int execute(char *argv[])
{
    pid_t id = fork();
    if(id < 0) 
    {
        return -1;
    }
    else if(id == 0)
    {
        execvp(argv[0], argv);
        exit(1);
    }
    else
    {
        int status = 0;
        pid_t rid = waitpid(id, &status, 0);
        if(rid > 0)
        {
            lastcode = WEXITSTATUS(status);
        }
    }
    return 0;
}

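void cd(const char *path)
{
    if(chdir(path) != 0)
    {
        perror("cd");
        return;
    }
    char tmp[1024];
    if(getcwd(tmp, sizeof(tmp)) == NULL) return;
    // cwd is global because putenv keeps the pointer it is given
    snprintf(cwd, sizeof(cwd), "PWD=%s", tmp);
    putenv(cwd);
}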

int doBuildin(char *argv[])
{
    if(strcmp(argv[0], "cd") == 0)
    {
        char *path = NULL;
        if(argv[1] == NULL) path = ".";
        else path = argv[1];
        cd(path);
        return 1;
    }
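    else if(strcmp(argv[0], "export") == 0)
    {
        if(argv[1] == NULL) return 1;
        // putenv keeps the pointer, so the string is copied into the global
        // enval first; a second export therefore overwrites the first one
        strcpy(enval, argv[1]);
        putenv(enval);
        return 1;
    }
    else if(strcmp(argv[0], "echo") == 0)
    {
        if(argv[1] == NULL) { printf("\n"); return 1; }
        if(argv[1][0] != '$') { printf("%s\n", argv[1]); return 1; }
        char *val = argv[1] + 1; // skip the leading '$'
        if(strcmp(val, "?") == 0)
        {
            printf("%d\n", lastcode);
            lastcode = 0;
        }
        else
        {
            const char *v = getenv(val);
            printf("%s\n", v ? v : "");
        }
        return 1;
    }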
    else if(0)
    {}

    return 0;
}

int main()
{
    while(1)
    {
        char usercommand[NUM];
        char *argv[SIZE];
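        // 1. print the prompt and read the user's command line
        int n = getUserCommand(usercommand, sizeof(usercommand));
        if(n < 0) break;    // EOF: leave the shell
        if(n == 0) continue;
        // 2. split the string
        // "ls -a -l" -> "ls" "-a" "-l"
        commandSplit(usercommand, argv);
        if(argv[0] == NULL) continue; // the line contained only separators
        // 3. check build-in command
        n = doBuildin(argv);
        if(n) continue;
        // 4. execute the command
        execute(argv);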
    }
}