9.1 Introduction to Multithreading

Contents

Preface
I. The Python Interpreter and the GIL
- 1.1 The Python Interpreter
- 1.2 The GIL (Global Interpreter Lock)
II. The threading Module
- 2.1 threading Module Functions
- 2.2 Thread Objects (Thread)
- 2.3 Lock Objects (Lock/RLock)
- 2.4 Condition Variables (Condition)
- 2.5 Semaphores (Semaphore)
- 2.6 Events (Event)
- 2.7 Timers (Timer)
III. Thread Pools (ThreadPoolExecutor)
IV. Thread Synchronization Best Practices
V. Notes

Preface

This article covers the threading module, its synchronization primitives, and thread pools.

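I. The Python Interpreter and the GIL

1.1 The Python Interpreter

The Python interpreter reads the code in a .py file and executes it. CPython, for example, compiles the source to bytecode and runs that bytecode on its virtual machine. Common implementations include:

- CPython: the official interpreter, written in C, and by far the most widely used.
- Jython: written in Java; compiles Python code to Java bytecode and runs on the JVM.
- IronPython: written in C#; runs on the .NET platform.
- IPython: an interactive shell built on top of CPython that improves the interactive experience. It is not a separate implementation of the language.
- PyPy: uses JIT (just-in-time) compilation and is usually faster than CPython.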

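1.2 The GIL (Global Interpreter Lock)

The GIL (Global Interpreter Lock) is a thread-synchronization mechanism in the CPython interpreter. It ensures that only one thread executes Python bytecode at any given moment.

Characteristics:

- It simplifies CPython's memory management and protects the interpreter's internal state from concurrent access. It does not make user code thread-safe (see section 2.3).
- It gives up parallel computation on multi-core processors.
- Under CPython, multithreading cannot achieve true parallelism for CPU-bound tasks.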
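
The effect on CPU-bound work can be observed with a small timing experiment. The sketch below is illustrative only: `count_primes`, `LIMIT`, and `JOBS` are hypothetical names chosen for this example, and the measured times depend on the machine and the interpreter build. On a standard CPython build, the threaded run is not expected to be noticeably faster than the sequential one.

```python
import threading
import time

def count_primes(limit):
    """CPU-bound work: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

LIMIT = 20_000   # hypothetical workload size
JOBS = 4         # number of identical jobs

# Sequential baseline: run the jobs one after another
start = time.perf_counter()
sequential = [count_primes(LIMIT) for _ in range(JOBS)]
sequential_time = time.perf_counter() - start

# The same jobs, one thread per job
threaded = [None] * JOBS

def run(index):
    threaded[index] = count_primes(LIMIT)

start = time.perf_counter()
threads = [threading.Thread(target=run, args=(i,)) for i in range(JOBS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded_time = time.perf_counter() - start

print(f'sequential: {sequential_time:.3f}s, threaded: {threaded_time:.3f}s')
```

Both runs compute the same results; only the elapsed time is of interest here.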

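Historical reasons for the GIL:

- In the early days it was a simple and effective solution to thread safety.
- A large number of third-party libraries, C extensions in particular, rely on the guarantees the GIL provides.
- Removing it is possible in theory, but it requires changes to a large amount of low-level code and is a major engineering effort. CPython 3.13 offers an optional, experimental free-threaded build, while the default build still has the GIL.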

II. The threading Module

2.1 threading Module Functions

python
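import threading

# List all Thread objects that are currently alive
threads = threading.enumerate()

# Number of alive threads
count = threading.active_count()

# The Thread object for the current thread
current = threading.current_thread()

# Identifier of the current thread
ident = threading.get_ident()

# The main Thread object
main = threading.main_thread()

# Get/set the stack size used for threads created afterwards
size = threading.stack_size()   # 0 means the platform default
threading.stack_size(32768)     # 32 KiB, the minimum allowed; may raise ValueError on some platforms

# Native (OS-assigned) ID of the current thread
native_id = threading.get_native_id()

# Largest timeout accepted by blocking calls such as Lock.acquire()
MAX_TIMEOUT = threading.TIMEOUT_MAX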

2.2 Thread Objects (Thread)

There are two ways to create a thread:

Option 1: instantiate the Thread class

python
import threading
import time

def worker(sleep_time, task_name):
    time.sleep(sleep_time)
    print(f'{task_name} finished - thread: {threading.current_thread().name}')

# Create the threads
t1 = threading.Thread(target=worker, args=(1, 'task 1'), name='Worker-1')
t2 = threading.Thread(target=worker, args=(2, 'task 2'), name='Worker-2')

# Start the threads
t1.start()
t2.start()

# Wait for the threads to finish
t1.join()
t2.join()
print('All tasks finished')

Option 2: subclass the Thread class

python
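import threading
import time

class MyThread(threading.Thread):
    def __init__(self, sleep_time, task_name):
        super().__init__()
        self.sleep_time = sleep_time
        self.task_name = task_name
        self.result = None

    def run(self):
        time.sleep(self.sleep_time)
        print(f'{self.task_name} finished - thread: {self.name}')
        # The return value of run() is discarded, so keep the result on the instance
        self.result = f'{self.task_name}_result'

# Use the custom thread class
t1 = MyThread(1, 'custom task 1')
t2 = MyThread(2, 'custom task 2')
t1.start()
t2.start()
t1.join()
t2.join()
print(t1.result, t2.result)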

Thread attributes and methods:

python
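import threading
import time

# Create a thread
thread = threading.Thread(target=lambda: time.sleep(1), name='DemoThread')

# Daemon flag: must be set before start(), otherwise RuntimeError is raised
thread.daemon = True  # mark as a daemon thread
thread.daemon         # read the flag (isDaemon() is deprecated since Python 3.10)

# Thread control
thread.start()      # start the thread
thread.join(0.5)    # wait up to 0.5 seconds for it to finish
thread.is_alive()   # check whether the thread is still running

# Thread attributes
thread.name         # thread name
thread.ident        # thread identifier
thread.native_id    # native thread ID assigned by the OS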

2.3 Lock Objects (Lock/RLock)

An example of a thread-safety problem:

python
import threading

counter = 0
def unsafe_increment():
    global counter
    for _ in range(100000):
        counter += 1  # read-modify-write, not atomic

# Running this in several threads may produce a wrong result
# (how often the race shows up depends on the CPython version)
threads = []
for _ in range(10):
    t = threading.Thread(target=unsafe_increment)
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print(f'Expected: 1000000, actual: {counter}')

Using a Lock to fix the thread-safety problem:

python
import threading

counter = 0
lock = threading.Lock()

def safe_increment():
    global counter
    for _ in range(100000):
        lock.acquire()      # acquire the lock
        try:
            counter += 1
        finally:
            lock.release()  # release the lock

# Or let the with statement manage the lock
def safe_increment_with():
    global counter
    for _ in range(100000):
        with lock:          # acquires and releases the lock automatically
            counter += 1

threads = []
for _ in range(10):
    t = threading.Thread(target=safe_increment_with)
    threads.append(t)
    t.start()

for t in threads:
    t.join()

print(f'Expected: 1000000, actual: {counter}')

RLock (reentrant lock) example:

python
import threading

rlock = threading.RLock()

def recursive_func(n):
    with rlock:
        if n > 0:
            print(f'Lock acquired, n={n}')
            recursive_func(n-1)

# An RLock can be acquired several times by the same thread
thread = threading.Thread(target=recursive_func, args=(3,))
thread.start()
thread.join()
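
To make the difference between the two lock types concrete, the following minimal sketch tries to acquire each lock twice from the same thread. It uses `blocking=False` so that the second attempt on the plain Lock returns immediately instead of deadlocking; the variable names are chosen for this illustration only.

```python
import threading

lock = threading.Lock()
rlock = threading.RLock()

# A plain Lock cannot be acquired twice, even by the thread that holds it
lock.acquire()
second_lock_attempt = lock.acquire(blocking=False)    # fails: already held
lock.release()

# An RLock remembers its owner and counts nested acquisitions
rlock.acquire()
second_rlock_attempt = rlock.acquire(blocking=False)  # succeeds: same thread
rlock.release()   # every acquire needs a matching release
rlock.release()

print(second_lock_attempt, second_rlock_attempt)  # False True
```

With a blocking second `lock.acquire()`, the thread would wait on itself forever, which is why recursive code such as `recursive_func` needs an RLock.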

2.4 Condition Variables (Condition)

python
import threading
import time

# Producer-consumer model
class SharedBuffer:
    def __init__(self, capacity):
        self.buffer = []
        self.capacity = capacity
        self.condition = threading.Condition()

    def produce(self, item):
        with self.condition:
            # Wait until the buffer has free space
            while len(self.buffer) >= self.capacity:
                self.condition.wait()

            self.buffer.append(item)
            print(f'Produced: {item}, buffer size: {len(self.buffer)}')
            self.condition.notify_all()  # wake up waiting consumers

    def consume(self):
        with self.condition:
            # Wait until the buffer has data
            while len(self.buffer) == 0:
                self.condition.wait()

            item = self.buffer.pop(0)
            print(f'Consumed: {item}, buffer size: {len(self.buffer)}')
            self.condition.notify_all()  # wake up waiting producers
            return item

# Try out the producer and the consumer
buffer = SharedBuffer(5)

def producer():
    for i in range(10):
        buffer.produce(f'product {i}')
        time.sleep(0.1)

def consumer():
    for _ in range(10):
        item = buffer.consume()
        time.sleep(0.2)

# Start the producer and consumer threads
p = threading.Thread(target=producer)
c = threading.Thread(target=consumer)
p.start()
c.start()
p.join()
c.join()
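
For comparison, the same producer-consumer pattern can be sketched with `queue.Queue` from the standard library, which handles the locking and waiting internally. This is an alternative to the hand-written buffer above, not a replacement for understanding Condition; `SENTINEL` and `consumed` are hypothetical names introduced for this example.

```python
import queue
import threading

buffer = queue.Queue(maxsize=5)   # put() blocks while 5 items are waiting
SENTINEL = None                   # hypothetical marker meaning "no more items"
consumed = []

def producer():
    for i in range(10):
        buffer.put(f'product {i}')
    buffer.put(SENTINEL)          # tell the consumer to stop

def consumer():
    while True:
        item = buffer.get()       # blocks while the queue is empty
        if item is SENTINEL:
            break
        consumed.append(item)

p = threading.Thread(target=producer)
c = threading.Thread(target=consumer)
p.start()
c.start()
p.join()
c.join()
print(consumed)
```

A sentinel value is one simple way to signal the end of the stream; it assumes a single consumer, since each consumer would need its own sentinel.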

2.5 Semaphores (Semaphore)

python
import threading
import time
import random

# Limit how many threads may use a resource at the same time
semaphore = threading.Semaphore(3)  # at most 3 threads at once

def access_resource(thread_id):
    with semaphore:
        print(f'Thread {thread_id} starts using the resource')
        time.sleep(random.uniform(1, 3))
        print(f'Thread {thread_id} is done with the resource')

# Simulate 10 threads competing for the limited resource
threads = []
for i in range(10):
    t = threading.Thread(target=access_resource, args=(i,))
    threads.append(t)
    t.start()

for t in threads:
    t.join()
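
One way to check that the limit really holds is to record the highest number of threads that were inside the guarded section at once. The sketch below does this with `threading.BoundedSemaphore`, a variant that raises `ValueError` when it is released more often than it was acquired. The counters `active` and `peak` and the short sleep are assumptions made for this illustration.

```python
import threading
import time

LIMIT = 3
semaphore = threading.BoundedSemaphore(LIMIT)
state_lock = threading.Lock()   # protects the two counters below
active = 0                      # threads currently inside the section
peak = 0                        # highest value `active` ever reached

def access_resource():
    global active, peak
    with semaphore:
        with state_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.05)        # simulate using the resource
        with state_lock:
            active -= 1

threads = [threading.Thread(target=access_resource) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f'peak concurrency: {peak} (limit {LIMIT})')

# A BoundedSemaphore refuses to go above its initial value
over_released = False
try:
    semaphore.release()
except ValueError:
    over_released = True
```

A plain `Semaphore` would silently accept the extra `release()` and allow one more thread in than intended, which is why the bounded variant is often the safer default.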

2.6 Events (Event)

python
import threading
import time

# An Event is used for communication between threads
event = threading.Event()

def waiter():
    print('Waiter: waiting for the event...')
    event.wait()  # block until the event is set
    print('Waiter: event set, starting work!')

def setter():
    print('Setter: doing preparation work...')
    time.sleep(3)
    print('Setter: preparation done, setting the event')
    event.set()  # set the event and wake up all waiting threads

# Create and start the threads
w1 = threading.Thread(target=waiter, name='Waiter-1')
w2 = threading.Thread(target=waiter, name='Waiter-2')
s = threading.Thread(target=setter, name='Setter')

w1.start()
w2.start()
time.sleep(1)  # give the waiters time to start waiting
s.start()

w1.join()
w2.join()
s.join()
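
Another common use of an Event is as a stop flag for a background loop. The sketch below relies on the fact that `Event.wait(timeout)` returns `True` once the event is set and `False` when the timeout expires; `poller`, `ticks`, and the chosen intervals are hypothetical.

```python
import threading
import time

stop_event = threading.Event()
ticks = []

def poller():
    # wait() doubles as an interruptible sleep: it returns False on
    # timeout (keep looping) and True once stop_event is set (exit)
    while not stop_event.wait(timeout=0.01):
        ticks.append(len(ticks))

t = threading.Thread(target=poller)
t.start()
time.sleep(0.1)        # let the loop run for a while
stop_event.set()       # ask the thread to stop
t.join(timeout=1)
print(f'thread alive: {t.is_alive()}, iterations: {len(ticks)}')
```

Compared with `time.sleep()` inside the loop, waiting on the event lets the thread react to the stop request immediately rather than after the current sleep ends.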

2.7 Timers (Timer)

python
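import threading

def delayed_task():
    print('Timed task executed!')

# Create a timer that fires after 5 seconds
timer = threading.Timer(5.0, delayed_task)
print('Timer started, it will fire in 5 seconds')
timer.start()

# The timer can be cancelled while it is still waiting
# timer.cancel()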
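
Cancellation can be sketched as follows. `Timer` is a subclass of `Thread`, so `join()` is available, and `cancel()` only has an effect while the timer is still in its waiting stage. The `fired` list is a hypothetical helper used to observe whether the callback ran.

```python
import threading

fired = []

def delayed_task():
    fired.append('ran')

timer = threading.Timer(0.5, delayed_task)
timer.start()
timer.cancel()   # stop the timer before the 0.5 s delay elapses
timer.join()     # the timer thread exits without calling delayed_task
print(fired)     # []
```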

III. Thread Pools (ThreadPoolExecutor)

The threading module itself does not provide a thread pool, but the concurrent.futures module offers a higher-level thread pool interface:

python
from concurrent.futures import ThreadPoolExecutor
import time

def task(name, duration):
    print(f'Task {name} started')
    time.sleep(duration)
    print(f'Task {name} finished')
    return f'{name}_result'

# Use a thread pool
with ThreadPoolExecutor(max_workers=3) as executor:
    # Submit the tasks
    futures = [
        executor.submit(task, f'task{i}', i)
        for i in range(1, 6)
    ]

    # Collect the results in submission order; result() blocks until that task is done
    for future in futures:
        result = future.result()
        print(f'Received result: {result}')
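
Besides `submit()`, the executor offers `map()`, and the module provides `as_completed()` for handling results as soon as they are ready. The sketch below shows both under assumed task names and durations: `map()` yields results in input order, while `as_completed()` yields futures in the order they finish.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def task(name, duration):
    time.sleep(duration)
    return f'{name}_result'

with ThreadPoolExecutor(max_workers=3) as executor:
    # map(): results come back in the order of the inputs,
    # regardless of which task finishes first
    ordered = list(executor.map(task, ['a', 'b', 'c'], [0.03, 0.02, 0.01]))

    # as_completed(): futures are yielded as they finish, so the
    # short task 'y' will normally be reported before 'x'
    futures = [executor.submit(task, 'x', 0.2), executor.submit(task, 'y', 0.01)]
    completion_order = [f.result() for f in as_completed(futures)]

print(ordered)
print(completion_order)
```

`as_completed()` is the better fit when each result can be processed independently and waiting for the slowest early task would waste time.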

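IV. Thread Synchronization Best Practices

- Avoid global variables: pass data as arguments or use thread-safe data structures where possible.
- Manage locks with the with statement: this ensures the lock is always released, even if an exception is raised.
- Avoid deadlocks: acquire locks in a fixed order and use timeouts.

Use thread-local storage where appropriate: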

python
import threading

# Thread-local data: each thread sees its own attributes
local_data = threading.local()

def show_local_data():
    print(f'Thread {threading.current_thread().name}: {local_data.value}')

def worker(value):
    local_data.value = value
    show_local_data()

threads = []
for i in range(3):
    t = threading.Thread(target=worker, args=(f'data {i}',))
    threads.append(t)
    t.start()

for t in threads:
    t.join()
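
The deadlock advice in this section (fixed lock order, timeouts) can be illustrated with a small sketch. The account names, `transfer`, and the 50 ms timeout are hypothetical; the point is that every thread takes `lock_a` before `lock_b`, so no two threads can each hold one lock while waiting for the other.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
balance = {'a': 100, 'b': 100}

def transfer(src, dst, amount):
    # Fixed order: always lock_a first, then lock_b,
    # whatever the direction of the transfer
    with lock_a:
        with lock_b:
            balance[src] -= amount
            balance[dst] += amount

threads = [threading.Thread(target=transfer, args=('a', 'b', 1)) for _ in range(50)]
threads += [threading.Thread(target=transfer, args=('b', 'a', 1)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(balance)

# Timeout: give up instead of waiting forever for a lock that is held
lock_a.acquire()
got_it = lock_a.acquire(timeout=0.05)   # False after about 50 ms
lock_a.release()
print(got_it)
```

When `acquire(timeout=...)` returns `False`, the caller has not obtained the lock and must not release it; typical reactions are to retry later or to report the failure.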

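V. Notes

- GIL limitation: for CPU-bound tasks, consider using multiple processes (the multiprocessing module).
- I/O-bound tasks: multithreading still improves throughput here, because a thread releases the GIL while it waits on blocking I/O.
- Thread-safe data structures: prefer thread-safe structures such as queue.Queue.
- Exception handling: an exception raised in a thread does not propagate to the main thread, so it has to be handled inside the thread.
- Resource cleanup: make sure threads terminate properly to avoid resource leaks.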
