Simulating Multithreading and Multiprocessing in Python, and Using Threads and Processes from Coroutine Code

When Are Python's Multithreading and Multiprocessing Appropriate?

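In current CPython versions, the global interpreter lock (GIL) prevents the threads of a process from computing truly in parallel: at any moment, at most one thread per process can execute bytecode, which keeps bytecode execution thread-safe.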

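As a result, normally only one thread in a process is using the CPU for computation at any given time, which limits Python's ability to exploit multi-core CPUs. Running multiple processes, of course, does take advantage of multiple cores and achieves truly parallel computation.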
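As a small aside, the interpreter exposes a little of this machinery. The sketch below reads the thread switch interval (how long a thread may run before it is asked to give up the GIL) and, where available, asks whether the GIL is enabled at all. The helper name `gil_info` is made up for illustration, and `sys._is_gil_enabled()` is assumed to exist only on CPython 3.13 and later, so the code falls back to `None` elsewhere.

```python
import sys


def gil_info():
    """Return (switch_interval_seconds, gil_enabled_or_None)."""
    # Interval after which the running thread is asked to release the GIL
    interval = sys.getswitchinterval()
    # sys._is_gil_enabled() is only present on newer CPython (3.13+);
    # None here means "this interpreter cannot tell us"
    checker = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = checker() if checker is not None else None
    return interval, gil_enabled


if __name__ == "__main__":
    interval, gil_enabled = gil_info()
    print(f"switch interval: {interval} s, GIL enabled: {gil_enabled}")
```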

But does that mean Python multithreading has no use at all?

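The short simulations below show when multiprocessing is the better fit and when multithreading is, and then test whether multiple threads in the same process can compute in parallel.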

Simulating a CPU-Bound Task

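With a CPU-bound task the CPU is busy the whole time, so spreading the work across processes to use multiple cores in parallel should speed it up substantially, by up to roughly the number of available cores.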

# -*- coding: utf-8 -*-
"""
Demonstrates whether multithreading or multiprocessing is faster for a CPU-bound task
"""
import time

from multiprocessing import Process
from threading import Thread


def cpu_intensive_task(n):
    """Simulate a CPU-bound task that keeps the CPU busy computing."""
    c = 0
    for i in range(10000000):
        c += 1
    print(n)


if __name__ == '__main__':
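    # Conclusion: for a CPU task with enough loop iterations, multiple processes beat multiple threads.
    #  For cheaper CPU work (e.g. cutting the c += 1 loop from 10 million to 1 million iterations), the result may reverse.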
    s = time.time()
    tasks = []
    for i in range(10):
        t = Thread(target=cpu_intensive_task, args=(i, ))       # threads: 1.947516679763794 s
        # t = Process(target=cpu_intensive_task, args=(i, ))        # processes: 0.7475171089172363 s
        t.start()
        tasks.append(t)
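    # Start joining only here, after every task has started; joining inside the loop above would run the tasks one at a time and skew the timing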
    for task in tasks:
        task.join()
    print(time.time() - s)

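The result is clear: running the CPU-bound computation 10 times in separate processes achieves true parallelism, and the elapsed time drops from about 2 s to about 0.7 s.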
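The same comparison can also be written with the executor classes from `concurrent.futures`, which take care of starting and joining the workers. This is only a sketch of one way to do it: the names `count_up` and `timed_run` are invented for illustration, the task and iteration counts are arbitrary, and the timings will differ from machine to machine.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_up(n):
    """CPU-bound work: increment a counter n times and return it."""
    c = 0
    for _ in range(n):
        c += 1
    return c


def timed_run(executor_cls, n_tasks, n_iterations):
    """Run n_tasks copies of count_up in the given pool; return (results, seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=n_tasks) as pool:
        results = list(pool.map(count_up, [n_iterations] * n_tasks))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    # Same workload, once with threads (serialized by the GIL) and once with processes
    for executor_cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        _, seconds = timed_run(executor_cls, n_tasks=4, n_iterations=2_000_000)
        print(f"{executor_cls.__name__}: {seconds:.2f} s")
```

The `if __name__ == "__main__":` guard matters for the process pool: on platforms that start workers by re-importing the main module, it keeps the workers from launching pools of their own.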

Simulating an IO-Bound Task

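For an IO-bound task the CPU is mostly idle while waiting on IO, so in theory running the tasks concurrently with processes or with threads should take about the same time. However, a process needs additional memory compared with a thread, and threads in the same process share resources, so switching between threads is cheaper than switching between processes. For IO-bound tasks, multithreading is therefore the better fit.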

# -*- coding: utf-8 -*-
"""
Tests whether multithreading or multiprocessing is faster for an IO-bound task
"""
import time

from multiprocessing import Process
from threading import Thread


def io_bound_task(n):
    """Simulate an IO task: the CPU sits idle while waiting for the IO to finish."""
    time.sleep(1)


if __name__ == '__main__':
    # Summary: this test shows that for IO-bound tasks, multithreading, with its lower resource use and cheaper switching, is noticeably more efficient
    s = time.time()
    tasks = []
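    # (Windows) Starting too many workers here fails. With 100 processes, for example, the run aborts with
    # "ImportError: DLL load failed while importing select: 页面文件太小，无法完成操作。" (the paging file is too small
    # to complete the operation) or a similar MemoryError.
    # Choose a loop count that suits the machine's hardware.
    for i in range(15):
        t = Thread(target=io_bound_task, args=(i, ))       # threads: 1.0163295269012451 s
        # t = Process(target=io_bound_task, args=(i, ))        # processes: 2.1188271045684814 s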
        t.start()
        tasks.append(t)

    for task in tasks:
        task.join()
    print(time.time() - s)

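The timings again bear this out: running 15 IO-bound tasks concurrently took about 1.0 s with threads versus about 2.1 s with processes, so multithreading is the more efficient choice for IO-bound tasks.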
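For IO-bound work, a thread pool is often the most convenient form of the same idea. The sketch below is an illustration under assumptions rather than a benchmark: `fake_io` and `run_io_tasks` are hypothetical names, and `time.sleep` stands in for a real IO wait such as a network request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(task_id, delay=0.2):
    """Stand-in for an IO wait; the thread does not hold the GIL while sleeping."""
    time.sleep(delay)
    return task_id


def run_io_tasks(n_tasks, delay=0.2):
    """Run n_tasks simulated IO waits in a thread pool; return (results, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        results = list(pool.map(lambda i: fake_io(i, delay), range(n_tasks)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, seconds = run_io_tasks(5)
    # Five 0.2 s waits overlap, so the total should be far below their 1.0 s sum
    print(results, f"{seconds:.2f} s")
```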

Using Threads from Coroutine Code: Can Threads in the Same Process Really Not Run in Parallel?

As noted above, because of the GIL, multiple threads in one process cannot run truly in parallel: only one thread executes at any given moment.

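Concurrency built on asynchronous coroutines is limited in a similar way. In the case below, both coroutines run on the event loop's single thread, so the CPU-bound computation blocks the program and nothing is computed in parallel: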

# -*- coding: utf-8 -*-
"""
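A CPU-bound function that takes about 1 s plus a coroutine that sleeps asynchronously for 1 s. If the two cannot overlap, the total running time should be about 2 s.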
"""
import asyncio
import time

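# new_event_loop() avoids relying on get_event_loop() creating a loop implicitly, which newer Python versions no longer do
loop = asyncio.new_event_loop()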


async def coro_task():
    """Coroutine that sleeps asynchronously for 1 s."""
    print("----------------coro_task start-----------------")
    await asyncio.sleep(1)
    print("----------------coro_task end-----------------")
    return "coro_task"


async def cpu_bound_task():     # Declared as a coroutine, but the body is CPU-bound work with no await
    """CPU-bound function that takes about 1 s to run."""
    print("----------------cpu_bound_task start-----------------")
    c = 0
    # Note: this CPU-bound loop takes about 1 s on the author's machine; adjust the iteration count for your hardware so the effect is easy to observe
    for i in range(5):
        for j in range(10000000):
            c += j
    print("----------------cpu_bound_task end-----------------")
    return "cpu_bound_task"


async def run_parallelly():
    s = time.time()
    for res in asyncio.as_completed([coro_task(), cpu_bound_task()]):
        await res

    # # Running them concurrently with gather gives the same result
    # results = await asyncio.gather(cpu_bound_task(), coro_task())
    print(f"Time taken in seconds: {time.time() - s}")


if __name__ == '__main__':
    loop.run_until_complete(run_parallelly())

Output:

----------------cpu_bound_task start-----------------
----------------cpu_bound_task end-----------------
----------------coro_task start-----------------
----------------coro_task end-----------------
Time taken in seconds: 2.150818109512329

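The result is as expected: the CPU-bound function (cpu_bound_task, about 1 s) and the 1 s asynchronous sleep (coro_task) take about 2 s in total. Strictly speaking, the GIL is not what serializes them here: both coroutines run on the event loop's single thread, and cpu_bound_task never awaits, so coro_task cannot make progress until it finishes.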
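A quick way to check where coroutines actually run is to have each one report its thread. The sketch below is a minimal illustration with made-up names (`which_thread`, `main`); it only shows that coroutines scheduled together with `asyncio.gather` share one thread, which is why a coroutine that never awaits holds up all the others.

```python
import asyncio
import threading


async def which_thread(delay):
    """Yield to the event loop once, then report the thread this coroutine runs on."""
    await asyncio.sleep(delay)
    return threading.current_thread().name


async def main():
    # Both coroutines are scheduled on the same event loop
    return await asyncio.gather(which_thread(0), which_thread(0.01))


if __name__ == "__main__":
    # Both entries should name the same thread
    print(asyncio.run(main()))
```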

However, once cpu_bound_task is turned into a regular (synchronous) function and run through loop.run_in_executor, cpu_bound_task and coro_task appear to run truly in parallel:

# -*- coding: utf-8 -*-
"""
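Why running a CPU-bound function with loop.run_in_executor(None, cpu_bound_task) does not block the event loop (its main thread)

The output shows that loop.run_in_executor(None, cpu_bound_task) starts a new thread, which seems to bypass the GIL and compute truly in parallel with coro_task?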
"""
import asyncio
import os
import threading
import time

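# new_event_loop() avoids relying on get_event_loop() creating a loop implicitly, which newer Python versions no longer do
loop = asyncio.new_event_loop()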


def get_running_info(task_name):
    """Print information about the current execution context, for comparison."""
    try:
        current_loop = asyncio.get_event_loop()
    except RuntimeError:
        current_loop = None

    current_thread = threading.current_thread().name
    current_process_id = os.getpid()
    print(f"{task_name=}, {current_loop=}, {current_process_id=}, {current_thread=}")
    return current_loop


async def coro_task():
    """Coroutine that sleeps asynchronously for 1 s."""
    print("----------------coro_task start-----------------")
    await asyncio.sleep(1)
    current_loop = get_running_info("coro_task")
    print("----------------coro_task end-----------------")
    return "coro_task"


def cpu_bound_task():
    """CPU-bound function that takes about 1 s to run."""
    print("----------------cpu_bound_task start-----------------")
    c = 0
    flag = False
    for i in range(5):
        for j in range(10000000):
            c += j
        if not flag:
            current_loop = get_running_info("cpu_bound_task")
            flag = True
    print("----------------cpu_bound_task end-----------------")
    return "cpu_bound_task"


async def run_parallelly():
    s = time.time()
    # executor=None makes loop.run_in_executor use the loop's default thread pool executor
    executor = None

    cpu_task = loop.run_in_executor(executor, cpu_bound_task)
    for res in asyncio.as_completed([coro_task(), cpu_task]):      
        await res

    # Running them concurrently with gather gives the same result
    # results = await asyncio.gather(cpu_task, coro_task())
    print(f"Time taken in seconds: {time.time() - s}")     


if __name__ == '__main__':
    loop.run_until_complete(run_parallelly())

Output:

----------------cpu_bound_task start-----------------
----------------coro_task start-----------------
task_name='cpu_bound_task', current_loop=None, current_process_id=37776, current_thread='asyncio_0'
task_name='coro_task', current_loop=<ProactorEventLoop running=True closed=False debug=False>, current_process_id=37776, current_thread='MainThread'
----------------coro_task end-----------------
----------------cpu_bound_task end-----------------
Time taken in seconds: 1.184269905090332

Judging by the output, the total time is just over 1 s, the longer of the two tasks, so coro_task and cpu_bound_task seem to have run truly in parallel.

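loop.run_in_executor(executor, cpu_bound_task) started a new thread, and that thread appears to compute at the same time as the event loop's main thread, unconstrained by the GIL. Is there an exception to the rule established earlier?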

Do Two Threads in the Same Process Actually Compute in Parallel?

In fact, the truth lies one layer below the surface:

Suppose coro_task is changed to do the same CPU computation as cpu_bound_task, and we look at the result:

# -*- coding: utf-8 -*-
import asyncio
import dis
import os
import threading
import time

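# new_event_loop() avoids relying on get_event_loop() creating a loop implicitly, which newer Python versions no longer do
loop = asyncio.new_event_loop()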


def get_running_info(task_name):
    """Print information about the current execution context, for comparison."""
    try:
        current_loop = asyncio.get_event_loop()
    except RuntimeError:
        current_loop = None

    current_thread = threading.current_thread().name
    current_process_id = os.getpid()
    print(f"{task_name=}, {current_loop=}, {current_process_id=}, {current_thread=}")
    return current_loop


# async def coro_task():
#     """Coroutine that sleeps asynchronously for 1 s."""
#     print("----------------coro_task start-----------------")
#     await asyncio.sleep(1)
#     current_loop = get_running_info("coro_task")
#     print("----------------coro_task end-----------------")
#     return "coro_task"


async def coro_task():
    """CPU-bound function that takes about 1 s to run."""
    print("----------------coro_task start-----------------")
    c = 0
    flag = False
    for i in range(5):
        for j in range(10000000):
            c += j
        if not flag:
            # current_loop = get_running_info("cpu_bound_task")
            flag = True
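    # Note: the label below still says "cpu_bound_task"; in the output, tell the two tasks apart by current_thread
    current_loop = get_running_info("cpu_bound_task")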
    print("----------------coro_task end-----------------")
    return "cpu_bound_task"


def cpu_bound_task():
    """CPU-bound function that takes about 1 s to run."""
    print("----------------cpu_bound_task start-----------------")
    c = 0
    flag = False
    for i in range(5):
        for j in range(10000000):
            c += j
        if not flag:
            # current_loop = get_running_info("cpu_bound_task")
            flag = True
    current_loop = get_running_info("cpu_bound_task")
    print("----------------cpu_bound_task end-----------------")
    return "cpu_bound_task"


async def run_parallelly():
    s = time.time()
    # executor=None makes loop.run_in_executor use the loop's default thread pool executor
    executor = None

    # With a process pool executor instead, the two tasks really do run in parallel
    # from concurrent.futures import ProcessPoolExecutor
    # executor = ProcessPoolExecutor()

    cpu_task = loop.run_in_executor(executor, cpu_bound_task)
    for res in asyncio.as_completed([coro_task(), cpu_task]):       
        await res

    # Running them concurrently with gather gives the same result
    # results = await asyncio.gather(cpu_task, coro_task())
    print(f"Time taken in seconds: {time.time() - s}")      


if __name__ == '__main__':
    loop.run_until_complete(run_parallelly())
    # print(dis.dis(coro_task))

Output:

----------------cpu_bound_task start-----------------
----------------coro_task start-----------------
task_name='cpu_bound_task', current_loop=None, current_process_id=29112, current_thread='asyncio_0'
----------------cpu_bound_task end-----------------
task_name='cpu_bound_task', current_loop=<ProactorEventLoop running=True closed=False debug=False>, current_process_id=29112, current_thread='MainThread'
----------------coro_task end-----------------
Time taken in seconds: 2.181851625442505

The elapsed time is now about 2 s.

So why did coro_task appear to run in parallel earlier, when all it did was await asyncio.sleep(1)?

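The explanation is simple. await asyncio.sleep(1) does not need the CPU: coro_task is merely suspended for 1 s, and the idle main thread does not hold the GIL while it waits. During that time the worker thread runs cpu_bound_task. After 1 s execution switches back to the main thread to finish coro_task, and then returns to cpu_bound_task to finish counting. This switching back and forth only makes the two tasks look as if they were running in parallel.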
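The effect can be reproduced in a few lines. The sketch below times a CPU loop on its own, then runs the same loop in a worker thread next to an asyncio.sleep of equal length. It uses `asyncio.to_thread` (Python 3.9+) in place of `loop.run_in_executor(None, ...)`; the names `spin` and `overlap_with_sleep` and the iteration count are assumptions made for illustration, and the exact timings depend on the machine.

```python
import asyncio
import time


def spin(n):
    """CPU-bound loop; the thread holds the GIL while it executes bytecode."""
    c = 0
    for i in range(n):
        c += i
    return c


async def overlap_with_sleep(n):
    """Return (seconds for the loop alone, seconds for loop-in-thread plus an equal sleep)."""
    # Baseline: the loop on its own (this briefly blocks the event loop, fine for a demo)
    start = time.perf_counter()
    spin(n)
    solo = time.perf_counter() - start

    # The sleeping coroutine needs no CPU, so the worker thread can run meanwhile
    start = time.perf_counter()
    await asyncio.gather(asyncio.to_thread(spin, n), asyncio.sleep(solo))
    together = time.perf_counter() - start
    return solo, together


if __name__ == "__main__":
    solo, together = asyncio.run(overlap_with_sleep(2_000_000))
    print(f"loop alone: {solo:.2f} s, loop + sleep together: {together:.2f} s")
```

If the two figures come out close to each other rather than one being double the other, the sleep and the computation overlapped, even though only one thread was ever executing bytecode.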

When these two lines in run_parallelly

    # from concurrent.futures import ProcessPoolExecutor
    # executor = ProcessPoolExecutor()

are uncommented, the CPU work runs in a separate process and the total time is about 1 s. Only now do the two tasks truly bypass the GIL and compute in parallel:

----------------coro_task start-----------------
----------------cpu_bound_task start-----------------
task_name='cpu_bound_task', current_loop=<ProactorEventLoop running=True closed=False debug=False>, current_process_id=18296, current_thread='MainThread'
----------------coro_task end-----------------
task_name='cpu_bound_task', current_loop=<ProactorEventLoop running=False closed=False debug=False>, current_process_id=4012, current_thread='MainThread'
----------------cpu_bound_task end-----------------
Time taken in seconds: 1.3650383949279785
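As a compact variant of the process-pool approach, the sketch below hands two CPU-bound calls to a ProcessPoolExecutor from inside a coroutine and reports which process computed each result. It is a sketch under assumptions, not the program above: `sum_range` is a hypothetical worker function, and it uses `asyncio.run` with `asyncio.get_running_loop()` instead of a module-level loop.

```python
import asyncio
import os
from concurrent.futures import ProcessPoolExecutor


def sum_range(n):
    """CPU-bound work; return the result together with the PID that computed it."""
    return sum(range(n)), os.getpid()


async def main():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor(max_workers=2) as pool:
        # Each call runs in a worker process with its own interpreter and its own GIL
        futures = [loop.run_in_executor(pool, sum_range, 5_000_000) for _ in range(2)]
        return await asyncio.gather(*futures)


if __name__ == "__main__":
    for total, pid in asyncio.run(main()):
        print(f"total={total}, worker pid={pid}, parent pid={os.getpid()}")
```

The worker function has to be defined at module level so that it can be pickled and sent to the worker processes.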