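Python Language Fundamentals

1. Introduction to Python and Environment Setup

1.1 Introduction to Python

Python is an interpreted, object-oriented, dynamically typed high-level programming language. It was created by the Dutch programmer Guido van Rossum over the Christmas holidays of 1989 and first publicly released in 1991.

1.1.1 Features of Python

- Easy to learn: the syntax is concise and readable, which makes Python a good first language
- Interpreted: there is no separate compile step; source files are run directly by the interpreter
- Object-oriented: supports the object-oriented programming paradigm
- Dynamically typed: variables do not need type declarations
- Cross-platform: runs on Windows, macOS, Linux, and other operating systems
- Rich standard library: ships with a large collection of built-in modules and functions
- Strong third-party ecosystem: for example NumPy, Pandas, Django, and TensorFlow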

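1.1.2 Application Areas

- Web development: frameworks such as Django, Flask, and FastAPI
- Data science: data analysis, machine learning, artificial intelligence
- Automation scripts: system administration, automated testing
- Game development: libraries such as Pygame
- GUI applications: Tkinter, PyQt, wxPython, and others
- Web scraping: Requests, BeautifulSoup, Scrapy, and others
- Scientific computing: NumPy, SciPy, Matplotlib, and others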

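1.2 Environment Setup

1.2.1 Installing Python

Windows

- Visit the official Python website
- Download the latest Python installer
- Run the installer and make sure the "Add Python to PATH" option is checked
- Click "Install Now" to complete the installation

After installation, verify it from the command line:

python --version  # or python -V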

Linux

Most Linux distributions come with Python preinstalled, though not necessarily the latest version. Use the following commands to install or update it:

# Ubuntu/Debian
sudo apt update
sudo apt install python3 python3-pip

# CentOS/RHEL
sudo yum install python3 python3-pip

Verify the installation:

python3 --version

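macOS

Older macOS releases shipped with Python 2.x and recent releases no longer bundle it, so install Python 3.x yourself:

- Install with Homebrew:
```bash
brew install python
```
- Or download the installer from the official Python website

Verify the installation:

python3 --version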

1.2.2 Installing pip

pip is Python's package manager, used to install and manage third-party libraries. Python 3.4 and later include pip by default.

Verify that pip is installed:

# Windows
pip --version

# Linux/macOS
pip3 --version

Upgrade pip:

# Windows
python -m pip install --upgrade pip

# Linux/macOS
python3 -m pip install --upgrade pip

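1.2.3 Virtual Environments

To isolate the dependencies of different projects, use a virtual environment:

Using venv (built in since Python 3.3)

# Create a virtual environment
# Windows
python -m venv myenv

# Linux/macOS
python3 -m venv myenv

# Activate the virtual environment
# Windows
myenv\Scripts\activate

# Linux/macOS
source myenv/bin/activate

# Leave the virtual environment
deactivate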
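
After activating an environment it can be useful to confirm, from inside Python, which interpreter is actually running. The sketch below relies on the standard `sys.prefix` and `sys.base_prefix` attributes; the helper name `in_virtualenv` is made up for this example, and the check targets venv-style environments (conda environments are organized differently and may not be detected).

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtualenv())
```

Running this before and after `activate` should show the interpreter path switching to the `myenv` directory.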

Using virtualenv

# Install virtualenv
pip install virtualenv

# Create a virtual environment
virtualenv myenv

# Activate the virtual environment (same commands as for venv)

# Leave the virtual environment
deactivate

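Using conda (Anaconda/Miniconda)

If you use Anaconda or Miniconda, create a virtual environment like this:

# Create a virtual environment
conda create --name myenv python=3.9

# Activate the virtual environment (conda 4.4 and later, all platforms)
conda activate myenv

# Older conda versions on Linux/macOS used:
# source activate myenv

# Leave the virtual environment
conda deactivate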

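1.3 Development Tools

1.3.1 Recommended IDEs and Editors

- PyCharm: a full-featured Python IDE with a free Community edition and a paid Professional edition
- Visual Studio Code: a lightweight editor, used together with the Python extension
- Jupyter Notebook: an interactive computing environment, well suited to data analysis and learning
- Spyder: a scientific-computing IDE with a MATLAB-like interface
- Sublime Text: a lightweight editor, used together with plugins

1.3.2 Installing Common Libraries

Inside a virtual environment, install common libraries with pip:

# Data science libraries
pip install numpy pandas matplotlib scipy

# Web development libraries
pip install django flask requests

# Machine learning libraries
pip install scikit-learn tensorflow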

1.4 Your First Python Program

Create a simple Python program:

- Create a file named hello.py
- Put the following code in it:

print("Hello, Python!")

Run it from the command line:

# Windows
python hello.py

# Linux/macOS
python3 hello.py

Output:

Hello, Python!

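1.5 The Python Interpreter

Python provides an interactive interpreter in which you can run Python code directly from the command line:

# Start the Python interpreter
# Windows
python

# Linux/macOS
python3

# Run code inside the interpreter
>>> print("Hello, Python!")
Hello, Python!
>>> 2 + 3
5
>>> exit()  # leave the interpreter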

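1.6 Troubleshooting

1.6.1 PATH Environment Variable

If the system cannot find the python command, Python is probably not on the PATH environment variable. Add it manually, or reinstall Python with the "Add Python to PATH" option checked.

1.6.2 Virtual Environment Problems

Activating a virtual environment fails:
- Windows: check the PowerShell execution policy, which may block the activation script; changing it may require running PowerShell as administrator
- Linux/macOS: check the file permissions of the environment directory, and load the script with source rather than executing it directly

1.6.3 pip Installation Failures

Network problems: try a mirror closer to you, for example one hosted in mainland China:

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple package_name

Permission problems: prefer installing into a virtual environment or with pip install --user. On Windows you may need to run the command prompt as administrator. On Linux/macOS, running pip with sudo can overwrite packages managed by the system package manager, so treat it as a last resort.

1.6.4 Version Compatibility

Make sure the versions of the third-party libraries you use are compatible with your Python version. Each library's official documentation lists the versions it supports.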

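2. Basic Python Syntax

2.1 Basic Syntax Rules

2.1.1 Indentation

Python uses indentation, not braces {}, to delimit code blocks. The usual convention is 4 spaces per level:

if 5 > 2:
    print("Five is greater than two!")
    print("This is also inside the if block")
print("This is outside the if block")

Indentation must be consistent within a block; otherwise Python raises an IndentationError.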
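
To see the error without breaking a real script, the snippet below hands a deliberately mis-indented source string to the built-in `compile()` function. The variable names are chosen for this illustration only.

```python
# Source text whose last line dedents to a level that matches no enclosing block.
bad_source = "if True:\n    x = 1\n  y = 2\n"

try:
    compile(bad_source, "<example>", "exec")
    error_name = None
except IndentationError as exc:
    error_name = type(exc).__name__
    # exc.lineno identifies the offending line in the source string.
    print(f"{error_name}: {exc.msg} (line {exc.lineno})")
```

Because the problem is detected while the source is being compiled, none of the code in the string runs.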

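2.1.2 Comments

Comments in Python start with #. Python has no dedicated multi-line comment syntax. A triple-quoted string that is not assigned to anything is evaluated and discarded, so it is sometimes used as one, but its main purpose is to serve as a docstring:

# This is a single-line comment
print("Hello, Python!")  # inline comment

"""
This is a triple-quoted string used as a multi-line comment.
It can span several lines
and is normally used as the docstring of a function or class.
"""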

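2.1.3 Lines and Statements

Python normally has one statement per line, but a semicolon lets you put several statements on one line:

x = 5; y = 10; z = x + y

A backslash continues a statement on the next line:

long_string = "This is a very long string, " \
              "continued with a backslash"

Expressions inside parentheses, square brackets, or curly braces can span lines without a backslash:

numbers = [1, 2, 3, 4,
           5, 6, 7]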

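2.2 Variables and Data Types

2.2.1 Variable Naming Rules

- Names may contain only letters, digits, and underscores
- Names must not start with a digit
- Names are case-sensitive
- Keywords cannot be used as names, and built-in function names should be avoided because reusing them hides the built-in

# Valid variable names
my_var = 10
user_name = "Python"
_count = 5

# Invalid variable names (commented out because each line is a SyntaxError)
# 2var = 10             # cannot start with a digit
# user-name = "Python"  # cannot contain a hyphen
# class = 5             # cannot use a keyword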
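
These rules can also be checked programmatically. The sketch below combines `str.isidentifier()` with `keyword.iskeyword()` from the standard library; the helper name `is_valid_variable_name` is invented for this example.

```python
import keyword


def is_valid_variable_name(name: str) -> bool:
    """Check that name is a legal identifier and not a reserved keyword."""
    # isidentifier() accepts keywords such as "class", so they are excluded separately.
    return name.isidentifier() and not keyword.iskeyword(name)


for candidate in ["my_var", "_count", "2var", "user-name", "class"]:
    print(f"{candidate!r}: {is_valid_variable_name(candidate)}")
```

Note that this check says nothing about built-in names: `list` or `print` pass it, even though reusing them as variable names is best avoided.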

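2.2.2 Variable Assignment

Python is dynamically typed, so variables do not need a declared type:

# Basic assignment
x = 5
name = "Python"
is_active = True

# Multiple assignment
x, y, z = 1, 2, 3

# Swapping variables
x, y = y, x

# Augmented assignment
x = 5
x += 1  # same as x = x + 1
x -= 2  # same as x = x - 2
x *= 3  # same as x = x * 3
x /= 2  # same as x = x / 2 (the result is a float)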

2.3 Operators

2.3.1 Arithmetic Operators

x = 10
y = 3

print(x + y)  # addition: 13
print(x - y)  # subtraction: 7
print(x * y)  # multiplication: 30
print(x / y)  # division: 3.3333333333333335
print(x // y)  # floor division: 3
print(x % y)  # remainder: 1
print(x ** y)  # exponentiation: 1000

2.3.2 Comparison Operators

x = 5
y = 3

print(x == y)  # equal: False
print(x != y)  # not equal: True
print(x > y)  # greater than: True
print(x < y)  # less than: False
print(x >= y)  # greater than or equal: True
print(x <= y)  # less than or equal: False

2.3.3 Logical Operators

x = True
y = False

print(x and y)  # logical and: False
print(x or y)  # logical or: True
print(not x)  # logical not: False

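2.3.4 Identity Operators

x = 5
y = 5
z = [1, 2, 3]
w = [1, 2, 3]

print(x is y)  # same object: True (CPython reuses small integers; do not rely on this)
print(z is w)  # same object: False (two separate list objects with equal contents)
print(x is not y)  # not the same object: False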
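
The difference between `==` (equal values) and `is` (same object) matters most with mutable objects. A small illustration with made-up variable names:

```python
a = [1, 2, 3]
b = [1, 2, 3]  # a second list with equal contents
c = a          # another name for the very same list object

print(a == b)  # True: equal values
print(a is b)  # False: two distinct objects
print(a is c)  # True: both names refer to one object

c.append(4)    # mutating through c is visible through a
print(a)       # [1, 2, 3, 4]

value = None
print(value is None)  # True: identity is the idiomatic test for None
```

In practice `is` is mostly reserved for comparisons with `None`, while `==` is used for comparing values.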

2.3.5 Membership Operators

fruits = ["apple", "banana", "cherry"]

print("apple" in fruits)  # in the list: True
print("orange" not in fruits)  # not in the list: True

2.4 Control Flow

2.4.1 if Statements

x = 10

if x > 5:
    print("x is greater than 5")
elif x == 5:
    print("x is equal to 5")
else:
    print("x is less than 5")

# Nested if
if x > 0:
    if x < 10:
        print("x is between 0 and 10")
    else:
        print("x is greater than or equal to 10")

# Conditional expression (ternary operator)
result = "Positive" if x > 0 else "Non-positive"
print(result)

2.4.2 for Loops

# Iterate over a list
fruits = ["apple", "banana", "cherry"]
for fruit in fruits:
    print(fruit)

# Use the range() function
for i in range(5):  # 0, 1, 2, 3, 4
    print(i)

for i in range(2, 6):  # 2, 3, 4, 5
    print(i)

for i in range(0, 10, 2):  # 0, 2, 4, 6, 8
    print(i)

# Iterate over a dictionary
person = {"name": "John", "age": 30, "city": "New York"}
for key in person:
    print(key, ":", person[key])

for key, value in person.items():
    print(key, ":", value)

# Iterate over a string
for char in "Python":
    print(char)

2.4.3 while Loops

# Basic while loop
count = 0
while count < 5:
    print(count)
    count += 1

# while-else loop: the else block runs when the condition becomes false
count = 0
while count < 5:
    print(count)
    count += 1
else:
    print("Loop completed normally")

2.4.4 break and continue

# break: leave the loop
for i in range(10):
    if i == 5:
        break
    print(i)  # prints only 0-4

# continue: skip the rest of this iteration and move on to the next
for i in range(10):
    if i % 2 == 0:
        continue
    print(i)  # prints only the odd numbers: 1, 3, 5, 7, 9
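
The `else` clause of a loop interacts with `break`: it runs only when the loop finishes without hitting `break`. The sketch below uses a hypothetical helper, `find_first_negative`, to show both outcomes.

```python
def find_first_negative(numbers):
    """Describe the first negative number in numbers, if there is one."""
    for n in numbers:
        if n < 0:
            result = f"found {n}"
            break  # leaving via break skips the else clause
    else:
        # Runs only when the loop was exhausted without a break.
        result = "no negative numbers"
    return result


print(find_first_negative([3, 1, -4, 1]))  # found -4
print(find_first_negative([3, 1, 4]))      # no negative numbers
```

This pattern avoids a separate "found" flag variable in search loops.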

2.5 List Comprehensions and Generator Expressions

2.5.1 List Comprehensions

A list comprehension is a concise way to build a list:

# Basic list comprehension
squares = [x**2 for x in range(10)]  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# List comprehension with a condition
even_squares = [x**2 for x in range(10) if x % 2 == 0]  # [0, 4, 16, 36, 64]

# Nested list comprehension
matrix = [[i*j for j in range(1, 4)] for i in range(1, 4)]
# [[1, 2, 3], [2, 4, 6], [3, 6, 9]]

# List comprehension with several for clauses
pairs = [(x, y) for x in [1, 2, 3] for y in [3, 1, 4] if x != y]
# [(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]

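2.5.2 Generator Expressions

A generator expression looks like a list comprehension but uses parentheses instead of square brackets. It returns a generator object that produces values on demand instead of building the whole list up front, so it uses far less memory:

squares_gen = (x**2 for x in range(10))
print(squares_gen)  # <generator object <genexpr> at 0x...>

# Iterate over the generator
for square in squares_gen:
    print(square)

# Compare memory use on a large data set
import sys
big_list = [x for x in range(1000000)]
big_gen = (x for x in range(1000000))
print(sys.getsizeof(big_list))  # large: the list holds a million references
print(sys.getsizeof(big_gen))  # small: the generator object has a fixed size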
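
One consequence of producing values on demand is that a generator can be consumed only once. The following lines illustrate this and show a generator expression being passed straight to `sum()`; the variable names are chosen for the example.

```python
squares_gen = (x**2 for x in range(5))

first_pass = list(squares_gen)   # consumes every value
second_pass = list(squares_gen)  # nothing is left to produce

print(first_pass)   # [0, 1, 4, 9, 16]
print(second_pass)  # []

# A generator expression can be the sole argument of a function call,
# so the intermediate list is never built.
total = sum(x**2 for x in range(1_000_000))
print(total)
```

If the values are needed more than once, build a list, or create a fresh generator for each pass.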

2.6 Basic Input and Output

2.6.1 Output

# Basic output
print("Hello, World!")

# Print several values, separated by spaces
name = "Python"
version = 3.9
print("Name:", name, "Version:", version)

# Formatted strings
print(f"Name: {name}, Version: {version}")  # f-string (Python 3.6+)
print("Name: {}, Version: {}".format(name, version))  # format method
print("Name: %s, Version: %.1f" % (name, version))  # old-style formatting (plain %f would print 3.900000)

# Write to a file
with open("output.txt", "w") as f:
    print("Hello, File!", file=f)

2.6.2 Input

# Basic input
name = input("Enter your name: ")
print(f"Hello, {name}!")

# Read a number (input() always returns a string, so convert it)
age = int(input("Enter your age: "))
print(f"You will be {age + 1} next year.")

# Read several values
x, y = map(int, input("Enter two numbers separated by space: ").split())
print(f"Sum: {x + y}")

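2.7 Keywords and Built-in Functions

2.7.1 Python Keywords

Python reserves the following keywords, which cannot be used as variable names (async and await are keywords since Python 3.7):

False    await    else     import   pass
None     break    except   in       raise
True     class    finally  is       return
and      continue for      lambda   try
as       def      from     nonlocal while
assert   del      global   not      with
async    elif     if       or       yield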

2.7.2 Common Built-in Functions

Python provides many built-in functions that can be used without importing anything:

# Math functions
print(abs(-5))  # absolute value: 5
print(max(1, 2, 3, 4))  # maximum: 4
print(min(1, 2, 3, 4))  # minimum: 1
print(sum([1, 2, 3, 4]))  # sum: 10
print(round(3.14159, 2))  # round to 2 decimal places: 3.14 (exact ties go to the nearest even digit)

# Type conversion functions
print(int("123"))  # string to integer: 123
print(float("3.14"))  # string to float: 3.14
print(str(123))  # integer to string: "123"
print(list((1, 2, 3)))  # tuple to list: [1, 2, 3]
print(tuple([1, 2, 3]))  # list to tuple: (1, 2, 3)

# Other common functions
print(len([1, 2, 3, 4]))  # length: 4
print(sorted([3, 1, 4, 1, 5, 9]))  # sorted copy: [1, 1, 3, 4, 5, 9]
print(type(42))  # type: <class 'int'>
print(isinstance(42, int))  # type check: True
print(range(5))  # range object: range(0, 5)
print(list(range(5)))  # converted to a list: [0, 1, 2, 3, 4]
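
The rounding behavior of `round()` is worth a closer look, because it differs from the "round half up" rule taught in school. The sketch below contrasts it with the `decimal` module; the helper `round_half_up` is an illustrative name, not a standard function.

```python
from decimal import Decimal, ROUND_HALF_UP

# Built-in round() resolves exact ties toward the nearest even integer.
print(round(0.5), round(1.5), round(2.5))  # 0 2 2

# Binary floats cannot store most decimal fractions exactly,
# so 2.675 is actually stored as a value slightly below 2.675.
print(round(2.675, 2))  # 2.67


def round_half_up(value: str, places: str = "0.01") -> Decimal:
    """Round a decimal string half-up to the precision given by places."""
    return Decimal(value).quantize(Decimal(places), rounding=ROUND_HALF_UP)


print(round_half_up("2.675"))     # 2.68
print(round_half_up("2.5", "1"))  # 3
```

Passing the value as a string keeps the exact decimal digits, which is why `Decimal` is a common choice for money amounts.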

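3. Python Data Types and Built-in Structures

3.1 Basic Data Types

3.1.1 Numeric Types

Python supports three numeric types: integers (int), floating-point numbers (float), and complex numbers (complex).

```python
# Integer
x = 10
print(type(x))  # <class 'int'>

# Floating-point number
y = 3.14
print(type(y))  # <class 'float'>

# Complex number
z = 2 + 3j
print(type(z))  # <class 'complex'>
print(z.real)  # real part: 2.0
print(z.imag)  # imaginary part: 3.0
```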

Python integers have arbitrary precision, so they never overflow:

large_num = 10**100  # a 1 followed by 100 zeros
print(large_num)  # printed in full
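
Floats, by contrast, have limited precision. The short example below shows two common consequences and the usual remedy, `math.isclose()`; the variable names are arbitrary.

```python
import math

total = 0.1 + 0.2
print(total == 0.3)              # False: binary floats are approximations
print(math.isclose(total, 0.3))  # True: compare within a tolerance instead

big = 2**64 + 1                  # an int keeps every digit
print(big)                       # 18446744073709551617
print(float(big) == big)         # False: a float carries only 53 bits of precision
```

When exact decimal arithmetic matters, the standard `decimal` and `fractions` modules are alternatives to float.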

3.1.2 Strings (str)

A string is an immutable sequence of characters, written with single quotes, double quotes, or triple quotes:

# Basic strings
name = "Python"
message = 'Hello, World!'

# Multi-line string
long_text = """
This is a multi-line string.
It can contain several
line breaks.
"""

# Concatenation
greeting = "Hello, " + name  # "Hello, Python"

# Repetition
separator = "-" * 10  # "----------"

# Indexing and slicing
print(name[0])  # first character: 'P'
print(name[-1])  # last character: 'n'
print(name[1:4])  # slice: 'yth'
print(name[:3])  # from the start up to index 3 (exclusive): 'Pyt'
print(name[3:])  # from index 3 to the end: 'hon'

# Strings are immutable
try:
    name[0] = 'p'
    # raises TypeError: 'str' object does not support item assignment
except TypeError as e:
    print(e)

Common string methods:

s = "Hello, Python!"

# Case conversion
print(s.upper())  # "HELLO, PYTHON!"
print(s.lower())  # "hello, python!"
print(s.title())  # "Hello, Python!"
print(s.capitalize())  # "Hello, python!"

# Searching and replacing
print(s.find("Python"))  # index of the substring: 7
print(s.replace("Python", "World"))  # replace the substring: "Hello, World!"

# Splitting and joining
words = s.split(", ")  # split: ['Hello', 'Python!']
print(", ".join(words))  # join: "Hello, Python!"

# Stripping whitespace
text = "  Hello World  "
print(text.strip())  # both ends: "Hello World"
print(text.lstrip())  # left end: "Hello World  "
print(text.rstrip())  # right end: "  Hello World"

# Testing string contents
print(s.startswith("Hello"))  # starts with: True
print(s.endswith("!"))  # ends with: True
print(s.isalpha())  # letters only: False
print("123".isdigit())  # digits only: True
print("Python123".isalnum())  # letters and digits only: True
print(s.isalnum())  # False (s contains a comma, a space, and an exclamation mark)

3.1.3 Booleans (bool)

A Boolean has only two values, True and False, and is mostly used in conditions:

x = True
y = False
print(type(x))  # <class 'bool'>

# Boolean operations
print(x and y)  # False
print(x or y)  # True
print(not x)  # False

# Numbers converted to Booleans
print(bool(0))  # False
print(bool(1))  # True
print(bool(10))  # True

# Empty values converted to Booleans
print(bool(""))  # False
print(bool([]))  # False
print(bool({}))  # False
print(bool(None))  # False

# Non-empty values converted to Booleans
print(bool("Hello"))  # True
print(bool([1, 2, 3]))  # True

3.1.4 None

None is a special constant that represents an empty or missing value:

x = None
print(type(x))  # <class 'NoneType'>
print(x is None)  # True

# None is often used as a default argument value
def greet(name=None):
    if name is None:
        return "Hello, World!"
    return f"Hello, {name}!"

3.2 Lists (list)

The list is Python's most commonly used mutable sequence type and can hold elements of different types.

3.2.1 Creating Lists

# Create an empty list
empty_list = []
empty_list = list()

# Create a list with elements
numbers = [1, 2, 3, 4, 5]
mixed = [1, "Python", 3.14, True, None]

# Use a list comprehension
squares = [x**2 for x in range(10)]

# Convert another iterable to a list
list_from_string = list("Python")  # ['P', 'y', 't', 'h', 'o', 'n']
list_from_range = list(range(5))  # [0, 1, 2, 3, 4]

3.2.2 List Indexing and Slicing

Like strings, lists support indexing and slicing:

fruits = ["apple", "banana", "cherry", "date"]

# Indexing
print(fruits[0])  # "apple"
print(fruits[-1])  # "date"

# Slicing
print(fruits[1:3])  # ["banana", "cherry"]
print(fruits[:2])  # ["apple", "banana"]
print(fruits[2:])  # ["cherry", "date"]
print(fruits[::2])  # ["apple", "cherry"]  (step of 2)
print(fruits[::-1])  # ["date", "cherry", "banana", "apple"]  (reversed copy)

3.2.3 List Methods

# Adding elements
fruits = ["apple", "banana"]
fruits.append("cherry")  # add to the end: ["apple", "banana", "cherry"]
fruits.extend(["date", "elderberry"])  # extend: ["apple", "banana", "cherry", "date", "elderberry"]
fruits.insert(1, "apricot")  # insert at an index: ["apple", "apricot", "banana", "cherry", "date", "elderberry"]

# Removing elements
fruits.remove("banana")  # remove by value: ["apple", "apricot", "cherry", "date", "elderberry"]
popped = fruits.pop()  # remove and return the last element
popped = fruits.pop(0)  # remove and return the element at an index
del fruits[1]  # delete the element at an index
fruits.clear()  # empty the list: []

# Searching and sorting
fruits = ["apple", "banana", "cherry"]
print(fruits.index("banana"))  # index of a value: 1
print(fruits.count("apple"))  # number of occurrences: 1

fruits.sort()  # sort in place: ["apple", "banana", "cherry"]
fruits.sort(reverse=True)  # sort in descending order: ["cherry", "banana", "apple"]

# Create a sorted copy
sorted_fruits = sorted(fruits)

# Reversing
fruits.reverse()  # reverse in place
reversed_fruits = fruits[::-1]  # create a reversed copy
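
Two details of sorting are easy to miss: `sorted()` and `list.sort()` both accept a `key` function, and `list.sort()` returns `None` because it works in place. A brief sketch with sample data invented for the example:

```python
words = ["banana", "Cherry", "apple", "date"]

by_length = sorted(words, key=len)               # new list; words is unchanged
case_insensitive = sorted(words, key=str.lower)  # compare lowercased copies

print(by_length)         # ['date', 'apple', 'banana', 'Cherry']
print(case_insensitive)  # ['apple', 'banana', 'Cherry', 'date']

in_place_result = words.sort()  # sorts words itself
print(in_place_result)          # None
print(words)                    # ['Cherry', 'apple', 'banana', 'date']
```

The last line shows the default ordering of strings, in which uppercase letters sort before lowercase ones.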

3.3 Tuples (tuple)

A tuple is an immutable sequence type: similar to a list, but it cannot be modified.

3.3.1 Creating Tuples

# Create tuples
empty_tuple = ()
empty_tuple = tuple()
single_element_tuple = (1,)  # the trailing comma is required
colors = ("red", "green", "blue")
colors = "red", "green", "blue"  # the parentheses can be omitted

# Convert another iterable to a tuple
tuple_from_string = tuple("Python")  # ('P', 'y', 't', 'h', 'o', 'n')
tuple_from_list = tuple([1, 2, 3])  # (1, 2, 3)

3.3.2 Tuple Operations

Because tuples are immutable, they do not support adding, removing, or changing elements. Indexing, slicing, and concatenation all work:

colors = ("red", "green", "blue")

# Indexing and slicing
print(colors[0])  # "red"
print(colors[1:3])  # ("green", "blue")

# Concatenation
new_colors = colors + ("yellow", "purple")  # ("red", "green", "blue", "yellow", "purple")

# Repetition
doubled_colors = colors * 2  # ("red", "green", "blue", "red", "green", "blue")

# Unpacking
r, g, b = colors
print(r, g, b)  # red green blue

# Ignoring some values
first, _, last = (1, 2, 3)
print(first, last)  # 1 3

# Collecting the remaining elements (rest is a list)
first, *rest = (1, 2, 3, 4, 5)
print(first, rest)  # 1 [2, 3, 4, 5]

3.3.3 Tuple Methods

Tuples have only a few methods because they are immutable:

colors = ("red", "green", "blue", "red")

print(colors.index("green"))  # index of a value: 1
print(colors.count("red"))  # number of occurrences: 2

3.4 Dictionaries (dict)

The dictionary is Python's mapping type. It stores key-value pairs.

3.4.1 Creating Dictionaries

# Create an empty dictionary
empty_dict = {}
empty_dict = dict()

# Create a dictionary
person = {"name": "John", "age": 30, "city": "New York"}
person = dict(name="John", age=30, city="New York")

# Create from a list of key-value pairs
items = [("name", "John"), ("age", 30)]
person = dict(items)

# Use a dictionary comprehension
squares = {x: x**2 for x in range(5)}  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

3.4.2 Dictionary Operations

person = {"name": "John", "age": 30, "city": "New York"}

# Accessing values
print(person["name"])  # "John"
print(person.get("age"))  # 30
print(person.get("country", "Unknown"))  # with a default value: "Unknown"

# Modifying and adding
person["age"] = 31  # change an existing key
person["country"] = "USA"  # add a new key-value pair

# Removing
person.pop("city")  # remove a key and return its value
person.popitem()  # remove and return the most recently inserted key-value pair
del person["age"]  # delete a key
person.clear()  # empty the dictionary
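
The difference between `d[key]` and `d.get(key, default)` becomes practical when a key may be missing. The sketch below counts words with `get()` and shows the `KeyError` raised by a plain lookup; the sample sentence is made up.

```python
text = "the quick brown fox jumps over the lazy dog the end"

counts = {}
for word in text.split():
    # get() supplies 0 the first time a word is seen.
    counts[word] = counts.get(word, 0) + 1

print(counts["the"])  # 3

try:
    counts["cat"]     # a plain lookup of a missing key raises KeyError
    missing_raised = False
except KeyError:
    missing_raised = True

print(missing_raised)         # True
print(counts.get("cat", 0))   # 0
```

For counting specifically, the standard library also offers `collections.Counter`, which wraps this pattern.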

3.4.3 Iterating Over Dictionaries and Dictionary Methods

person = {"name": "John", "age": 30, "city": "New York"}

# Iterate over keys
for key in person:
    print(key)

# Iterate over values
for value in person.values():
    print(value)

# Iterate over key-value pairs
for key, value in person.items():
    print(key, ":", value)

# Dictionary methods
print(person.keys())  # all keys: dict_keys(['name', 'age', 'city'])
print(person.values())  # all values: dict_values(['John', 30, 'New York'])
print(person.items())  # all pairs: dict_items([('name', 'John'), ('age', 30), ('city', 'New York')])

# Copy a dictionary (shallow copy)
person_copy = person.copy()
person_copy = dict(person)

# Update a dictionary
person.update({"age": 31, "country": "USA"})

3.5 Sets (set)

A set is an unordered collection with no duplicate elements.

3.5.1 Creating Sets

# Create an empty set
empty_set = set()  # note: {} creates an empty dictionary, not a set

# Create sets
fruits = {"apple", "banana", "cherry"}
fruits = set(["apple", "banana", "cherry", "apple"])  # duplicates are dropped: {'apple', 'banana', 'cherry'}

# Use a set comprehension
even_numbers = {x for x in range(10) if x % 2 == 0}  # {0, 2, 4, 6, 8}

3.5.2 Set Operations

fruits = {"apple", "banana", "cherry"}

# Adding elements
fruits.add("date")
fruits.update(["elderberry", "fig"])

# Removing elements
fruits.remove("banana")  # raises KeyError if the element is missing
fruits.discard("grape")  # does nothing if the element is missing
popped = fruits.pop()  # remove and return an arbitrary element
fruits.clear()  # empty the set

3.5.3 Set Algebra

set1 = {1, 2, 3, 4}
set2 = {3, 4, 5, 6}

# Union
print(set1 | set2)  # {1, 2, 3, 4, 5, 6}
print(set1.union(set2))

# Intersection
print(set1 & set2)  # {3, 4}
print(set1.intersection(set2))

# Difference
print(set1 - set2)  # {1, 2}
print(set1.difference(set2))

# Symmetric difference (union minus intersection)
print(set1 ^ set2)  # {1, 2, 5, 6}
print(set1.symmetric_difference(set2))

# Subsets and supersets
set3 = {1, 2}
print(set3.issubset(set1))  # True
print(set1.issuperset(set3))  # True
print(set3 <= set1)  # True
print(set1 >= set3)  # True

# Disjoint sets
set4 = {7, 8}
print(set1.isdisjoint(set4))  # True
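
A frequent use of sets is removing duplicates, with the caveat that a set does not remember the original order. The sketch below contrasts `set()` with `dict.fromkeys()`, which keeps first-seen order because dictionaries preserve insertion order (Python 3.7+); the sample data is invented.

```python
items = ["b", "a", "b", "c", "a"]

unique_unordered = set(items)                 # duplicates removed, order not guaranteed
unique_in_order = list(dict.fromkeys(items))  # duplicates removed, first-seen order kept

print(sorted(unique_unordered))  # ['a', 'b', 'c']
print(unique_in_order)           # ['b', 'a', 'c']

# Membership tests are another common use of sets.
allowed = {"a", "b"}
print([item for item in items if item in allowed])  # ['b', 'a', 'b', 'a']
```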

3.6 Frozen Sets (frozenset)

A frozenset is an immutable set: once created, it cannot be modified.

# Create a frozenset
frozen_fruits = frozenset(["apple", "banana", "cherry"])

# A frozenset does not support add, remove, or other mutating operations
# frozen_fruits.add("date")  # AttributeError

# Set algebra still works
another_set = {"banana", "cherry", "date"}
print(frozen_fruits & another_set)  # frozenset({'banana', 'cherry'})

3.7 Type Conversion

Python provides built-in functions for converting between data types:

# Convert to an integer
print(int(3.14))  # 3 (truncates toward zero)
print(int("123"))  # 123
print(int("0xFF", 16))  # 255 (hexadecimal)

# Convert to a float
print(float(10))  # 10.0
print(float("3.14"))  # 3.14

# Convert to a string
print(str(123))  # "123"
print(str(3.14))  # "3.14"
print(str([1, 2, 3]))  # "[1, 2, 3]"

# Convert to a list
print(list("Python"))  # ['P', 'y', 't', 'h', 'o', 'n']
print(list((1, 2, 3)))  # [1, 2, 3]
print(list({"name": "John", "age": 30}))  # ['name', 'age'] (keys only)

# Convert to a tuple
print(tuple([1, 2, 3]))  # (1, 2, 3)
print(tuple("Python"))  # ('P', 'y', 't', 'h', 'o', 'n')

# Convert to a set
print(set([1, 2, 2, 3, 4]))  # {1, 2, 3, 4}
print(set("Python"))  # {'P', 'y', 't', 'h', 'o', 'n'} (element order may vary)

# Convert to a dictionary
print(dict([("name", "John"), ("age", 30)]))  # {'name': 'John', 'age': 30}
print(dict(zip(["name", "age"], ["John", 30])))  # {'name': 'John', 'age': 30}

3.8 Type Checking

Use type() and isinstance() to check the type of a value:

x = 10
y = [1, 2, 3]

print(type(x))  # <class 'int'>
print(type(y))  # <class 'list'>

print(isinstance(x, int))  # True
print(isinstance(y, list))  # True

# isinstance() takes inheritance into account
class Animal:
    pass

class Dog(Animal):
    pass

my_dog = Dog()
print(isinstance(my_dog, Dog))  # True
print(isinstance(my_dog, Animal))  # True
print(type(my_dog) is Dog)  # True
print(type(my_dog) is Animal)  # False

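4. Python Functions and Modules

4.1 Defining and Calling Functions

A function is an organized, reusable block of code that performs a specific task.

4.1.1 Basic Function Definition

```python
def greet():
    """A simple greeting function."""
    print("Hello, World!")

# Call the function
greet()  # prints: Hello, World!

# Access the function's docstring
print(greet.__doc__)  # prints: A simple greeting function.
```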

4.1.2 Functions with Parameters

def greet(name):
    """A greeting function that takes a parameter."""
    print(f"Hello, {name}!")

greet("Python")  # prints: Hello, Python!

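4.2 Kinds of Function Parameters

Python supports several ways of passing arguments.

4.2.1 Positional Arguments

Positional arguments are the most basic kind. They must be passed in the order in which the parameters are defined.

def add(a, b):
    return a + b

result = add(3, 5)  # 8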

4.2.2 Default Parameters

Default parameters let callers omit some arguments, in which case the predefined default value is used.

def greet(name, greeting="Hello"):
    return f"{greeting}, {name}!"

print(greet("Python"))  # Hello, Python!
print(greet("Python", "Hi"))  # Hi, Python!

Note: default values are evaluated once, when the function is defined, so they should be immutable objects. A mutable default is shared across calls and can cause surprising behavior:

# Not recommended
def add_item(item, items=[]):
    items.append(item)
    return items

list1 = add_item(1)
list2 = add_item(2)
print(list1)  # [1, 2]  (surprising: both names refer to the same shared list)
print(list2)  # [1, 2]

# Recommended
def add_item(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items

list1 = add_item(1)
list2 = add_item(2)
print(list1)  # [1]
print(list2)  # [2]

4.2.3 Keyword Arguments

Keyword arguments specify values by parameter name, so they do not have to follow the order of the definition.

def describe_person(name, age, city):
    return f"{name} is {age} years old and lives in {city}."

# Using keyword arguments
print(describe_person(name="John", city="New York", age=30))
# Output: John is 30 years old and lives in New York.

4.2.4 Variable Positional Arguments

*args accepts any number of positional arguments, which are packed into a tuple.

def sum_all(*args):
    return sum(args)

print(sum_all(1, 2, 3))  # 6
print(sum_all(1, 2, 3, 4, 5))  # 15

4.2.5 Variable Keyword Arguments

**kwargs accepts any number of keyword arguments, which are packed into a dictionary.

def print_info(**kwargs):
    for key, value in kwargs.items():
        print(f"{key}: {value}")

print_info(name="John", age=30, city="New York")
# Output:
# name: John
# age: 30
# city: New York

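4.2.6 Combining Parameter Kinds

A function definition can combine the different kinds of parameters, but they must appear in this order: positional parameters → parameters with defaults → *args → **kwargs. Keyword-only parameters, if any, go between *args and **kwargs.

def func(a, b=10, *args, **kwargs):
    print(f"a = {a}")
    print(f"b = {b}")
    print(f"args = {args}")
    print(f"kwargs = {kwargs}")

func(1, 2, 3, 4, name="John", age=30)
# Output:
# a = 1
# b = 2
# args = (3, 4)
# kwargs = {'name': 'John', 'age': 30}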
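
Parameters that follow a bare `*` in the signature are keyword-only: callers must name them. The function `connect` and its parameters below are hypothetical, chosen only to illustrate the syntax.

```python
def connect(host, port=5432, *, timeout=10.0, retries=3):
    """Hypothetical helper: timeout and retries can only be passed by name."""
    return {"host": host, "port": port, "timeout": timeout, "retries": retries}


settings = connect("localhost", 8080, timeout=2.5)
print(settings)

try:
    connect("localhost", 8080, 2.5)  # a third positional argument is rejected
    rejected = False
except TypeError as exc:
    rejected = True
    print("TypeError:", exc)
```

Keyword-only parameters make call sites self-describing and let new options be added later without breaking positional callers.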

4.3 Return Values

A function uses the return statement to return one or more values.

4.3.1 Basic Return Values

def square(x):
    return x * x

result = square(5)  # 25

4.3.2 Multiple Return Values

A Python function can return several values; what it actually returns is a single tuple.

def get_name_and_age():
    return "John", 30

name, age = get_name_and_age()  # unpack the tuple
print(name)  # John
print(age)  # 30

4.3.3 Functions Without a Return Value

A function that has no return statement, or a return with no value, returns None.

def greet():
    print("Hello")

result = greet()
print(result)  # None

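4.4 Variable Scope

The scope of a variable is the region of code in which it can be accessed. Python has four scopes: local, enclosing (closure), global, and built-in.

4.4.1 Local Scope

A variable defined inside a function has local scope and can be accessed only inside that function.

def func():
    x = 10  # local variable
    print(x)

func()  # 10
# print(x)  # error: NameError: name 'x' is not defined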

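4.4.2 Global Scope

A variable defined outside any function has global scope and can be read anywhere in the module. To rebind a global variable inside a function, declare it with the global keyword.

x = 10  # global variable

def func():
    print(x)  # reading a global variable needs no declaration

def modify_global():
    global x  # declare that x refers to the global variable
    x = 20

func()  # 10
modify_global()
print(x)  # 20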

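4.4.3 The nonlocal Keyword

In a nested function, the nonlocal keyword lets the inner function rebind a variable of the enclosing function.

def outer():
    x = 10  # variable in the enclosing scope

    def inner():
        nonlocal x  # declare that x refers to the enclosing function's variable
        x = 20

    inner()
    print(x)

outer()  # 20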
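
A typical use of `nonlocal` is a closure that keeps state between calls. The factory `make_counter` below is an illustrative name, not part of any library.

```python
def make_counter(start=0):
    """Return a function that remembers and advances its own count."""
    count = start

    def increment(step=1):
        nonlocal count  # rebind the variable owned by make_counter
        count += step
        return count

    return increment


counter_a = make_counter()
counter_b = make_counter(100)  # an independent counter with its own state

print(counter_a(), counter_a(), counter_b())  # 1 2 101
```

Without the `nonlocal` declaration, `count += step` would raise `UnboundLocalError`, because the assignment would make `count` local to `increment`.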

4.5 Anonymous Functions (Lambda)

A lambda is a small anonymous function whose body is a single expression.

4.5.1 Basic Syntax

lambda arguments: expression

4.5.2 Examples

# Define a lambda function
square = lambda x: x * x
print(square(5))  # 25

# Pass a lambda to another function
def apply(func, x):
    return func(x)

result = apply(lambda x: x * 2, 5)  # 10

# Use a lambda as a sort key
points = [(1, 2), (3, 1), (5, 0), (2, 4)]
points.sort(key=lambda p: p[1])  # sort by the second element
print(points)  # [(5, 0), (3, 1), (1, 2), (2, 4)]

# Use a lambda with filter()
numbers = [1, 2, 3, 4, 5, 6]
even_numbers = list(filter(lambda x: x % 2 == 0, numbers))
print(even_numbers)  # [2, 4, 6]

# Use a lambda with map()
squared_numbers = list(map(lambda x: x ** 2, numbers))
print(squared_numbers)  # [1, 4, 9, 16, 25, 36]

4.6 Functions as First-Class Objects

In Python, functions are first-class objects, which means they can be:

- Passed as arguments to other functions
- Returned from functions
- Assigned to variables
- Stored in data structures

# Assign a function to a variable
def greet(name):
    return f"Hello, {name}!"

hello = greet
print(hello("Python"))  # Hello, Python!

# Pass a function as an argument
def apply(func, arg):
    return func(arg)

print(apply(greet, "World"))  # Hello, World!

# Return a function from a function
def make_greeter(greeting):
    def greeter(name):
        return f"{greeting}, {name}!"
    return greeter

hi_greeter = make_greeter("Hi")
hello_greeter = make_greeter("Hello")

print(hi_greeter("Python"))  # Hi, Python!
print(hello_greeter("Python"))  # Hello, Python!

# Store functions in a data structure
functions = [greet, hi_greeter, hello_greeter]
for func in functions:
    print(func("Python"))

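4.7 Decorators

A decorator is a function that wraps another function to extend its behavior without changing the original function's code.

4.7.1 A Basic Decorator

# Define a simple decorator
def my_decorator(func):
    def wrapper():
        print("Before the function runs")
        func()
        print("After the function runs")
    return wrapper

# Apply the decorator
@my_decorator
def say_hello():
    print("Hello, World!")

# Call the decorated function
say_hello()
# Output:
# Before the function runs
# Hello, World!
# After the function runs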

4.7.2 Decorating Functions That Take Arguments

def my_decorator(func):
    def wrapper(*args, **kwargs):
        print("Before the function runs")
        result = func(*args, **kwargs)
        print("After the function runs")
        return result
    return wrapper

@my_decorator
def add(a, b):
    return a + b

result = add(3, 5)
print(result)  # 8
# Output:
# Before the function runs
# After the function runs
# 8

4.7.3 Decorators That Take Arguments

def repeat(n):
    def decorator(func):
        def wrapper(*args, **kwargs):
            for i in range(n):
                print(f"Run number {i+1}")
                result = func(*args, **kwargs)
            return result
        return wrapper
    return decorator

@repeat(3)
def say_hello():
    print("Hello!")

say_hello()
# Output:
# Run number 1
# Hello!
# Run number 2
# Hello!
# Run number 3
# Hello!

4.7.4 Common Decorator Examples

# Timing decorator
import time
def timer(func):
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(f"{func.__name__} took {end - start:.4f} seconds")
        return result
    return wrapper

@timer
def slow_function():
    time.sleep(1)
    return "Done"

result = slow_function()  # prints something like: slow_function took 1.0012 seconds

# Logging decorator
def logger(func):
    def wrapper(*args, **kwargs):
        print(f"Calling {func.__name__} with args={args}, kwargs={kwargs}")
        result = func(*args, **kwargs)
        print(f"{func.__name__} returned: {result}")
        return result
    return wrapper

@logger
def add(a, b):
    return a + b

result = add(3, 5)  # prints the call details and the return value
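
One side effect of wrapping is that the decorated function takes on the wrapper's name and docstring. The standard `functools.wraps` helper copies that metadata across, as this comparison sketches (the two decorator names are invented for the example):

```python
import functools


def plain_decorator(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def wrapping_decorator(func):
    @functools.wraps(func)  # copy __name__, __doc__, and other metadata from func
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@plain_decorator
def add(a, b):
    """Return a + b."""
    return a + b


@wrapping_decorator
def multiply(a, b):
    """Return a * b."""
    return a * b


print(add.__name__, add.__doc__)            # wrapper None
print(multiply.__name__, multiply.__doc__)  # multiply Return a * b.
```

Preserving the metadata keeps tracebacks, `help()`, and documentation tools pointing at the original function.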

4.8 Modules

A module is a file containing Python code: functions, classes, variables, and constants. Modules keep code organized and reusable.

4.8.1 Creating and Importing Modules

Create a file named my_module.py:

# my_module.py

def greet(name):
    """Greeting function."""
    return f"Hello, {name}!"

pi = 3.14159265359

class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return pi * self.radius ** 2

Import and use the module from another Python file:

# Import the whole module
import my_module

print(my_module.greet("Python"))  # Hello, Python!
print(my_module.pi)  # 3.14159265359
circle = my_module.Circle(5)
print(circle.area())  # 78.53981633975

# Import specific names from the module
from my_module import greet, pi

print(greet("World"))  # Hello, World!
print(pi)  # 3.14159265359

# Import the module under an alias
import my_module as mm

print(mm.greet("Python"))  # Hello, Python!

# Import a specific name under an alias
from my_module import Circle as C

circle = C(5)
print(circle.area())  # 78.53981633975

# Import everything from the module (not recommended)
from my_module import *

print(greet("Everyone"))  # Hello, Everyone!

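4.8.2 The __name__ Variable

Every Python module has a special variable, __name__, whose value tells you whether the module is being run as the main program or has been imported.

# my_module.py

def greet(name):
    return f"Hello, {name}!"

if __name__ == "__main__":
    # Runs when the module is executed directly
    print(greet("Main"))
else:
    # Runs when the module is imported
    print("Module was imported")

When my_module.py is run directly, __name__ is "__main__". When my_module is imported from another module, __name__ is the module's name ("my_module").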
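
In practice the guard usually calls a single entry-point function, which keeps the module importable and testable. The following is one possible layout; the `main` function and its `argv` parameter are a convention assumed for this sketch, not something Python requires.

```python
import sys


def greet(name):
    return f"Hello, {name}!"


def main(argv=None):
    """Entry point; argv is a list of arguments, defaulting to the command line."""
    args = sys.argv[1:] if argv is None else argv
    name = args[0] if args else "World"
    print(greet(name))
    return 0


if __name__ == "__main__":
    # Only runs when the file is executed directly, not when it is imported.
    main()
```

Because `main` accepts an explicit argument list, other code (including tests) can call `main(["Ada"])` without touching `sys.argv`.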

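4.9 Packages

A package is a way of organizing Python modules: a directory that contains several modules, normally together with an __init__.py file. (Since Python 3.3 a directory without __init__.py can still be imported as a namespace package, but regular packages include the file.)

4.9.1 Creating a Package

Create the following directory structure:

my_package/
    __init__.py
    module1.py
    module2.py

The __init__.py file can be empty or can contain initialization code:

# my_package/__init__.py

__version__ = "1.0.0"

# Importing submodules or functions here makes them available at package level
from .module1 import greet
from .module2 import calculate

Create the module files:

# my_package/module1.py
def greet(name):
    return f"Hello, {name}!"

# my_package/module2.py
def calculate(a, b):
    return a + b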

4.9.2 Importing Packages and Modules

# Import the whole package
import my_package

print(my_package.__version__)  # 1.0.0
print(my_package.greet("Python"))  # Hello, Python!
print(my_package.calculate(3, 5))  # 8

# Import modules from the package
from my_package import module1, module2

print(module1.greet("World"))  # Hello, World!
print(module2.calculate(3, 5))  # 8

# Import a module under an alias
import my_package.module1 as m1

print(m1.greet("Python"))  # Hello, Python!

# Import a function from a module in the package
from my_package.module1 import greet

print(greet("Everyone"))  # Hello, Everyone!

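4.10 Module Search Path

When a module is imported, Python first checks the modules that are already loaded or built into the interpreter, and then searches the directories listed in sys.path in order:

- The directory containing the script being run (or the current directory in an interactive session), which is the first entry of sys.path
- The remaining sys.path entries: directories from the PYTHONPATH environment variable, the standard library, and installed third-party packages

sys.path can be inspected and modified:

import sys

# Show the module search path
for path in sys.path:
    print(path)

# Add a custom directory to the search path
sys.path.append("/path/to/my/modules")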
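
To find out where a particular import would come from without actually importing it, the standard `importlib.util.find_spec` function can be used. This is a small diagnostic sketch; the module name `module_that_does_not_exist` is a deliberately missing placeholder.

```python
import importlib.util
import sys

# The first entry is normally the script's directory
# (an empty string in an interactive session).
print("first search entry:", repr(sys.path[0]) if sys.path else None)

for name in ["json", "sys", "module_that_does_not_exist"]:
    spec = importlib.util.find_spec(name)
    if spec is None:
        print(f"{name} -> not found on the search path")
    else:
        # origin is a file path for ordinary modules and "built-in" for built-ins
        print(f"{name} -> {spec.origin}")
```

This is handy when a local file accidentally shadows a standard-library or third-party module of the same name.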

4.11 Standard Library Import Examples

Python ships with a rich standard library that is available through import:

# The math module
import math

print(math.pi)  # 3.141592653589793
print(math.sqrt(16))  # 4.0
print(math.sin(math.pi/2))  # 1.0

# The random module
import random

print(random.random())  # random float in the range [0.0, 1.0)
print(random.randint(1, 10))  # random integer from 1 to 10, inclusive
print(random.choice([1, 2, 3, 4, 5]))  # random element of the list

# The time module
import time

print(time.time())  # current timestamp
print(time.localtime())  # current local time
time.sleep(1)  # pause for 1 second

# The os module
import os

print(os.getcwd())  # current working directory
print(os.listdir())  # files and folders in the current directory

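4.12 Relative and Absolute Imports

Inside a package you can use either relative or absolute imports.

4.12.1 Relative Imports

Relative imports use dots (.) to express a path relative to the current module:

- .module refers to a module in the same package
- ..module refers to a module in the parent package
- ...module refers to a module in the grandparent package

# In my_package/module1.py
from . import module2  # import module2 from the same package
from .subpackage import module3  # import a module from a subpackage
from .. import parent_module  # import from the parent package (requires my_package to be nested inside another package)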

4.12.2 Absolute Imports

Absolute imports use the full package path:

# Absolute imports
import my_package.module1
from my_package import module2
from my_package.subpackage import module3

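4.13 Best Practices

- Module and package naming: use lowercase letters and underscores, and avoid Python reserved words.
- Avoid circular imports, where A imports B and B imports A.
- Use the __all__ variable: define an __all__ list in a module to state explicitly which names are imported by from module import *.

# my_module.py
__all__ = ["greet", "Circle"]  # only the names listed in __all__ are imported by "from my_module import *"

def greet(name):
    return f"Hello, {name}!"

def internal_function():
    pass

class Circle:
    pass

- Relative versus absolute imports: choose whichever fits the situation; absolute imports are recommended for large projects.
- Avoid from module import *: it can cause name clashes and makes code harder to read.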

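5. Object-Oriented Programming in Python

5.1 Overview

Object-oriented programming (OOP) is a programming paradigm that designs programs around "objects". In Python almost everything is an object with attributes and methods.

The main features of object-oriented programming are:

- Encapsulation: bundling data and methods into a single unit (a class)
- Inheritance: letting one class inherit the attributes and methods of another
- Polymorphism: letting different classes respond differently to the same message
- Abstraction: hiding complex implementation details and exposing only the necessary interface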

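5.2 Classes and Objects

5.2.1 Defining a Class

A class is a blueprint or template for objects. It defines their attributes (data) and methods (behavior).

class Person:
    """A person."""

    # Class variable, shared by all instances
    species = "Homo sapiens"

    def __init__(self, name, age):
        """Initialize the instance."""
        # Instance variables: each instance has its own copy
        self.name = name
        self.age = age

    def greet(self):
        """Return a greeting."""
        return f"Hello, my name is {self.name}."

    def celebrate_birthday(self):
        """Increase the age by one and return a birthday message."""
        self.age += 1
        return f"Happy Birthday! Now {self.name} is {self.age} years old."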

5.2.2 Creating and Using Objects

An object is an instance of a class. Creating an object is called instantiation.

# Create instances of the Person class
person1 = Person("Alice", 30)
person2 = Person("Bob", 25)

# Access instance variables
print(person1.name)  # "Alice"
print(person1.age)  # 30

# Call instance methods
print(person1.greet())  # "Hello, my name is Alice."
print(person1.celebrate_birthday())  # "Happy Birthday! Now Alice is 31 years old."

# Access the class variable
print(Person.species)  # "Homo sapiens"
print(person1.species)  # "Homo sapiens"

# Change the class variable
Person.species = "Human"
print(person1.species)  # "Human"
print(person2.species)  # "Human"

# Assigning through an instance creates an instance attribute that shadows
# the class variable; the class variable itself is unchanged
person1.species = "Individual Human"
print(person1.species)  # "Individual Human"
print(person2.species)  # "Human"
print(Person.species)  # "Human"
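
The shared nature of class variables is most visible when the variable is mutable. In the sketch below, `Team`, `members`, and `tags` are names invented for the illustration.

```python
class Team:
    members = []  # class variable: one list shared by every instance

    def __init__(self, name):
        self.name = name
        self.tags = []  # instance variable: a fresh list per instance


red = Team("red")
blue = Team("blue")

red.members.append("Alice")  # mutates the shared list
red.tags.append("home")      # mutates only red's own list

print(blue.members)  # ['Alice']
print(blue.tags)     # []
print(vars(red))     # {'name': 'red', 'tags': ['home']}
```

`vars(red)` lists only the instance's own attributes, which is why `members` does not appear there. Per-instance state should be created in `__init__`.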

5.2.3 The __init__ Method

__init__ is a special method that is called automatically when an object is created and is used to initialize the object's attributes. It plays the role of a constructor in other languages.

class Car:
    def __init__(self, brand, model, year):
        self.brand = brand
        self.model = model
        self.year = year
        self.odometer_reading = 0  # default value

    def get_descriptive_name(self):
        return f"{self.year} {self.brand} {self.model}"

    def read_odometer(self):
        return f"This car has {self.odometer_reading} miles on it."

my_car = Car("Audi", "A4", 2020)
print(my_car.get_descriptive_name())  # "2020 Audi A4"
print(my_car.read_odometer())  # "This car has 0 miles on it."

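5.3 Inheritance

Inheritance lets you create a new class that takes over the attributes and methods of an existing class. The class being inherited from is called the parent class or base class; the new class is called the child class or derived class.

5.3.1 Basic Inheritance

class Animal:
    """Base class for animals."""

    def __init__(self, name):
        self.name = name

    def speak(self):
        """Return the sound the animal makes."""
        return "Some generic sound"

    def move(self):
        """Describe the animal moving."""
        return f"{self.name} is moving"

class Dog(Animal):
    """A dog, derived from Animal."""

    def __init__(self, name, breed):
        # Call the parent class's __init__ method
        super().__init__(name)
        self.breed = breed

    # Override the parent's speak method
    def speak(self):
        return "Woof!"

    def fetch(self):
        """A method specific to dogs."""
        return f"{self.name} is fetching"

# Create an instance of Dog
my_dog = Dog("Rex", "German Shepherd")

# Access inherited attributes and methods
print(my_dog.name)  # "Rex"
print(my_dog.move())  # "Rex is moving"

# Access attributes and methods specific to the subclass
print(my_dog.breed)  # "German Shepherd"
print(my_dog.fetch())  # "Rex is fetching"

# Call the overridden method
print(my_dog.speak())  # "Woof!"

# Call the parent class's version of the method
print(super(Dog, my_dog).speak())  # "Some generic sound"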

5.3.2 多重继承

Python支持多重继承，一个类可以继承多个父类。

class Swimmer:
    def swim(self):
        return "Swimming"

class Flyer:
    def fly(self):
        return "Flying"

class Duck(Animal, Swimmer, Flyer):
    """鸭子类，继承自动物类、游泳者类和飞行者类"""
    
    def speak(self):
        return "Quack!"

# 创建Duck类的实例
my_duck = Duck("Donald")

# 调用继承的方法
print(my_duck.speak())  # "Quack!"
print(my_duck.swim())  # "Swimming"
print(my_duck.fly())  # "Flying"
print(my_duck.move())  # "Donald is moving"

5.3.3 方法解析顺序（MRO）

在多重继承中，当多个父类有同名方法时，Python会按照一定的顺序来查找方法，这个顺序称为方法解析顺序（MRO）。

# 查看MRO
print(Duck.__mro__)  # (<class '__main__.Duck'>, <class '__main__.Animal'>, <class '__main__.Swimmer'>, <class '__main__.Flyer'>, <class 'object'>)

# 使用mro()方法查看
print(Duck.mro())  # 同上
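To see why the lookup order matters, the following sketch uses a diamond-shaped hierarchy. The class names Base, Left, Right, and Child are hypothetical and not part of the examples above. Each describe() delegates to super(), which follows the MRO of the instance's class rather than the parent of the class in which the call is written.

```python
class Base:
    def describe(self):
        return ["Base"]


class Left(Base):
    def describe(self):
        return ["Left"] + super().describe()


class Right(Base):
    def describe(self):
        return ["Right"] + super().describe()


class Child(Left, Right):
    def describe(self):
        return ["Child"] + super().describe()


mro_names = [cls.__name__ for cls in Child.mro()]
print(mro_names)
# ['Child', 'Left', 'Right', 'Base', 'object']

# Inside Left.describe(), super() resolves to Right (the next class in
# Child's MRO), so Base.describe() runs exactly once.
print(Child().describe())
# ['Child', 'Left', 'Right', 'Base']
```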

5.4 多态

多态是指不同的对象对同一消息做出不同的响应。在Python中，多态是通过方法重写和动态类型实现的。

class Cat(Animal):
    def speak(self):
        return "Meow!"

class Cow(Animal):
    def speak(self):
        return "Moo!"

# 创建不同的动物实例
animals = [Dog("Rex", "German Shepherd"), Cat("Whiskers"), Cow("Bessie")]

# 多态：同样的方法调用，不同的行为
for animal in animals:
    print(f"{animal.name} says: {animal.speak()}")

# 输出:
# Rex says: Woof!
# Whiskers says: Meow!
# Bessie says: Moo!
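Because Python is dynamically typed, polymorphism does not require a shared base class: any object that offers the expected method can be used (often called duck typing). A minimal sketch, assuming two unrelated hypothetical classes, Robot and Parrot, that do not inherit from Animal:

```python
class Robot:
    """Not an Animal subclass; it only happens to provide name and speak()."""

    def __init__(self, name):
        self.name = name

    def speak(self):
        return "Beep!"


class Parrot:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return "Hello!"


def introduce(speaker):
    # Works with any object that has a name attribute and a speak() method.
    return f"{speaker.name} says: {speaker.speak()}"


lines = [introduce(s) for s in (Robot("R2"), Parrot("Polly"))]
for line in lines:
    print(line)
# R2 says: Beep!
# Polly says: Hello!
```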

5.5 封装

封装是将数据和方法组合在一个单元（类）中，并控制对数据的访问。在Python中，虽然没有真正的私有成员，但可以使用命名约定来表示私有或受保护的成员。

5.5.1 命名约定

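Public members: ordinary names; attributes and methods that can be accessed from anywhere.
Protected members: names starting with a single underscore (_), which by convention are meant for use inside the class and its subclasses only. Python does not enforce this.
Private members: names starting with a double underscore (__), meant for use inside the class only. Python applies name mangling and rewrites __name to _ClassName__name, which makes accidental access harder but not impossible.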

class Person:
    def __init__(self, name, age, salary):
        self.name = name  # 公共属性
        self._age = age  # 受保护属性
        self.__salary = salary  # 私有属性
    
    def get_salary(self):
        """获取工资（访问器方法）"""
        return self.__salary
    
    def set_salary(self, new_salary):
        """设置工资（修改器方法）"""
        if new_salary > 0:
            self.__salary = new_salary
        else:
            print("Invalid salary")

# 创建Person实例
person = Person("Alice", 30, 50000)

# 访问公共属性
print(person.name)  # "Alice"

# 访问受保护属性（不推荐，但技术上可以）
print(person._age)  # 30

# 尝试直接访问私有属性（会引发错误）
# print(person.__salary)  # AttributeError: 'Person' object has no attribute '__salary'

# 访问名称修饰后的属性（不推荐）
print(person._Person__salary)  # 50000

# 使用访问器和修改器方法
print(person.get_salary())  # 50000
person.set_salary(60000)
print(person.get_salary())  # 60000

5.5.2 属性装饰器

Python提供了属性装饰器（@property），可以将方法转换为属性访问，使代码更加简洁和Pythonic。

class Person:
    def __init__(self, name, age, salary):
        self.name = name
        self._age = age
        self.__salary = salary
    
    @property
    def salary(self):
        """工资属性的getter方法"""
        return self.__salary
    
    @salary.setter
    def salary(self, new_salary):
        """工资属性的setter方法"""
        if new_salary > 0:
            self.__salary = new_salary
        else:
            print("Invalid salary")
    
    @property
    def can_vote(self):
        """只读属性，判断是否可以投票"""
        return self._age >= 18

# 使用属性装饰器
person = Person("Alice", 30, 50000)

# 使用属性语法访问
print(person.salary)  # 50000
person.salary = 60000  # 调用setter方法
print(person.salary)  # 60000

# 尝试设置无效的工资
person.salary = -1000  # 输出: Invalid salary

# 访问只读属性
print(person.can_vote)  # True

# 尝试修改只读属性（会引发错误）
# person.can_vote = False  # AttributeError (the exact message varies by Python version), because no setter is defined
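The setter above prints a message and silently keeps the old value, so the caller cannot tell that the assignment failed. A common alternative is to raise an exception from the setter. The sketch below assumes a hypothetical Employee class and also routes the assignment in `__init__` through the setter so that the initial value is validated as well.

```python
class Employee:
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary  # goes through the setter, so it is validated too

    @property
    def salary(self):
        return self._salary

    @salary.setter
    def salary(self, value):
        if value <= 0:
            raise ValueError(f"salary must be positive, got {value}")
        self._salary = value


emp = Employee("Alice", 50000)
try:
    emp.salary = -1000
except ValueError as exc:
    error_message = str(exc)
    print(error_message)  # salary must be positive, got -1000

print(emp.salary)  # 50000, the invalid assignment changed nothing
```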

5.6 特殊方法（魔术方法）

特殊方法是Python中以双下划线开头和结尾的方法，用于实现对象的特殊行为。它们也被称为魔术方法（magic methods）或dunder方法（double underscore methods）。

5.6.1 常用特殊方法

class Book:
    def __init__(self, title, author, pages):
        self.title = title
        self.author = author
        self.pages = pages
    
    def __str__(self):
        """定义对象的字符串表示（使用str()或print()时调用）"""
        return f"'{self.title}' by {self.author}"
    
    def __repr__(self):
        """定义对象的正式字符串表示（在解释器中直接显示对象时调用）"""
        return f"Book(title='{self.title}', author='{self.author}', pages={self.pages})"
    
    def __len__(self):
        """定义对象的长度（使用len()时调用）"""
        return self.pages
    
    def __eq__(self, other):
        """定义对象的相等比较（使用==时调用）"""
        if not isinstance(other, Book):
            return NotImplemented
        return self.title == other.title and self.author == other.author
    
    def __lt__(self, other):
        """定义对象的小于比较（使用<时调用）"""
        if not isinstance(other, Book):
            return NotImplemented
        return self.pages < other.pages

# 创建Book实例
book1 = Book("Python Crash Course", "Eric Matthes", 544)
book2 = Book("Python Crash Course", "Eric Matthes", 544)
book3 = Book("Fluent Python", "Luciano Ramalho", 792)

# 测试特殊方法
print(book1)  # 调用__str__: 'Python Crash Course' by Eric Matthes
print(repr(book1))  # 调用__repr__: Book(title='Python Crash Course', author='Eric Matthes', pages=544)
print(len(book1))  # 调用__len__: 544
print(book1 == book2)  # 调用__eq__: True
print(book1 == book3)  # 调用__eq__: False
print(book1 < book3)  # 调用__lt__: True
print(book3 > book1)  # Book defines no __gt__, so Python falls back to the reflected call book1.__lt__(book3): True

# 排序
books = [book3, book1, book2]
books.sort()
print([str(book) for book in books])
# 输出: ["'Python Crash Course' by Eric Matthes", "'Python Crash Course' by Eric Matthes", "'Fluent Python' by Luciano Ramalho"]

5.6.2 算术运算符特殊方法

class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    
    def __add__(self, other):
        """向量加法（使用+时调用）"""
        if isinstance(other, Vector):
            return Vector(self.x + other.x, self.y + other.y)
        return NotImplemented
    
    def __sub__(self, other):
        """向量减法（使用-时调用）"""
        if isinstance(other, Vector):
            return Vector(self.x - other.x, self.y - other.y)
        return NotImplemented
    
    def __mul__(self, other):
        """向量乘法（使用*时调用）"""
        if isinstance(other, (int, float)):
            # 标量乘法
            return Vector(self.x * other, self.y * other)
        elif isinstance(other, Vector):
            # 点积
            return self.x * other.x + self.y * other.y
        return NotImplemented
    
    def __abs__(self):
        """向量的模（使用abs()时调用）"""
        import math
        return math.sqrt(self.x ** 2 + self.y ** 2)
    
    def __str__(self):
        return f"Vector({self.x}, {self.y})"

# 测试向量运算
v1 = Vector(3, 4)
v2 = Vector(1, 2)

print(v1 + v2)  # Vector(4, 6)
print(v1 - v2)  # Vector(2, 2)
print(v1 * 2)  # Vector(6, 8)
print(v1 * v2)  # 11 (点积)
print(abs(v1))  # 5.0
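With only `__mul__` defined, `v1 * 2` works but `2 * v1` raises TypeError, because int does not know how to multiply by a Vector. Defining the reflected method `__rmul__` covers that case. A minimal sketch, using a hypothetical Vec2 class reduced to scalar multiplication:

```python
class Vec2:
    """Hypothetical stand-in for the Vector class above, reduced to scaling."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __mul__(self, scalar):
        if isinstance(scalar, (int, float)):
            return Vec2(self.x * scalar, self.y * scalar)
        return NotImplemented

    # Tried for `scalar * vec` after the left operand's __mul__ returns
    # NotImplemented; scaling is commutative, so the same function is reused.
    __rmul__ = __mul__

    def __eq__(self, other):
        if not isinstance(other, Vec2):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __repr__(self):
        return f"Vec2({self.x}, {self.y})"


v = Vec2(3, 4)
print(v * 2)  # Vec2(6, 8)
print(2 * v)  # Vec2(6, 8)
```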

5.7 类方法和静态方法

除了实例方法外，Python还支持类方法和静态方法。

5.7.1 类方法

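A class method is defined with the @classmethod decorator, and its first parameter is the class itself (conventionally named cls). Because it receives the class rather than an instance, it can read and modify class variables but has no self through which to reach instance variables. Class methods are also commonly used as factory methods.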

class Person:
    # 类变量
    population = 0
    
    def __init__(self, name):
        self.name = name
        Person.population += 1
    
    @classmethod
    def get_population(cls):
        """获取总人口数（类方法）"""
        return cls.population
    
    @classmethod
    def create_anonymous(cls):
        """创建匿名人员（工厂方法）"""
        return cls("Anonymous")

# 创建Person实例
p1 = Person("Alice")
p2 = Person("Bob")

# 调用类方法
print(Person.get_population())  # 2

# 使用工厂方法
p3 = Person.create_anonymous()
print(p3.name)  # "Anonymous"
print(Person.get_population())  # 3
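One practical benefit of receiving cls is that a factory method automatically builds instances of whichever subclass it is called on. A minimal sketch, assuming hypothetical User and Admin classes and a "name,age" input format chosen only for illustration:

```python
class User:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    @classmethod
    def from_string(cls, text):
        """Build an instance from 'name,age' (the format is an assumption)."""
        name, age = text.split(",")
        return cls(name.strip(), int(age))


class Admin(User):
    pass


u = User.from_string("Alice, 30")
a = Admin.from_string("Bob, 25")
print(type(u).__name__, type(a).__name__)  # User Admin
```

Had from_string() called User(...) directly instead of cls(...), Admin.from_string() would have returned a plain User.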

5.7.2 静态方法

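A static method is defined with the @staticmethod decorator and receives neither self nor cls. It therefore has no implicit access to instance or class state (it can still refer to the class explicitly by name) and behaves like a plain function that lives in the class's namespace.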

class Math:
    @staticmethod
    def add(a, b):
        """静态加法方法"""
        return a + b
    
    @staticmethod
    def multiply(a, b):
        """静态乘法方法"""
        return a * b

# 调用静态方法
print(Math.add(3, 5))  # 8
print(Math.multiply(3, 5))  # 15

# 也可以通过实例调用静态方法（但不推荐）
m = Math()
print(m.add(3, 5))  # 8

5.8 抽象类

抽象类是一种不能直接实例化的类，用于定义接口和共享功能。在Python中，可以使用abc模块来创建抽象类。

from abc import ABC, abstractmethod

class Shape(ABC):
    """形状抽象类"""
    
    @abstractmethod
    def area(self):
        """计算面积的抽象方法"""
        pass
    
    @abstractmethod
    def perimeter(self):
        """计算周长的抽象方法"""
        pass
    
    def display(self):
        """显示形状信息（具体方法）"""
        print(f"Area: {self.area()}")
        print(f"Perimeter: {self.perimeter()}")

class Rectangle(Shape):
    """矩形类，实现Shape抽象类"""
    
    def __init__(self, width, height):
        self.width = width
        self.height = height
    
    def area(self):
        return self.width * self.height
    
    def perimeter(self):
        return 2 * (self.width + self.height)

class Circle(Shape):
    """圆形类，实现Shape抽象类"""
    
    def __init__(self, radius):
        self.radius = radius
    
    def area(self):
        import math
        return math.pi * self.radius ** 2
    
    def perimeter(self):
        import math
        return 2 * math.pi * self.radius

# 创建具体子类的实例
rect = Rectangle(5, 3)
circle = Circle(4)

# 调用方法
rect.display()
# 输出:
# Area: 15
# Perimeter: 16

circle.display()
# 输出:
# Area: 50.26548245743669
# Perimeter: 25.132741228718345

# 尝试创建抽象类的实例（会引发错误）
# shape = Shape()  # TypeError: Shape cannot be instantiated because area and perimeter are abstract (the exact message varies by Python version)

5.9 多重继承和Mixins

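A mixin is a small class that provides one narrowly scoped piece of behavior and is not meant to be instantiated on its own. Instead of modeling an "is-a" relationship, mixins are combined into other classes to add capabilities; in Python this is done through multiple inheritance.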

# 定义Mixins类
class SwimMixin:
    def swim(self):
        return f"{self.name} is swimming"

class FlyMixin:
    def fly(self):
        return f"{self.name} is flying"

class RunMixin:
    def run(self):
        return f"{self.name} is running"

# 使用Mixins组合功能
class Duck(Animal, SwimMixin, FlyMixin):
    def speak(self):
        return "Quack!"

class Dog(Animal, RunMixin):
    def speak(self):
        return "Woof!"

class Superhero(Animal, FlyMixin, RunMixin):
    def speak(self):
        return "I'm a superhero!"

# 测试组合的功能
duck = Duck("Donald")
dog = Dog("Rex")
superhero = Superhero("Superman")

print(duck.swim())  # "Donald is swimming"
print(duck.fly())  # "Donald is flying"
print(dog.run())  # "Rex is running"
print(superhero.fly())  # "Superman is flying"
print(superhero.run())  # "Superman is running"

5.10 元类

元类是创建类的类，它控制类的创建过程。Python中，类也是对象，而元类就是创建这些类对象的东西。

5.10.1 基本元类

class MyMeta(type):
    """自定义元类"""
    
    def __new__(mcs, name, bases, attrs):
        """创建类时调用"""
        # 添加一个类属性
        attrs['created_by'] = 'MyMeta'
        
        # 处理方法名称（将方法名转换为大写）
        methods_to_uppercase = {}
        for key, value in attrs.items():
            if callable(value) and not key.startswith('__'):
                methods_to_uppercase[key.upper()] = value
            else:
                methods_to_uppercase[key] = value
        
        # 创建类
        return super().__new__(mcs, name, bases, methods_to_uppercase)
    
    def __init__(cls, name, bases, attrs):
        """初始化类时调用"""
        print(f"创建类: {name}")
        super().__init__(name, bases, attrs)

# 使用元类
class MyClass(metaclass=MyMeta):
    def hello(self):
        return "Hello, World!"

# 测试元类的效果
print(MyClass.created_by)  # "MyMeta"

# 注意方法名已经变成大写
my_instance = MyClass()
print(my_instance.HELLO())  # "Hello, World!"

# 原来的方法名不存在
# print(my_instance.hello())  # AttributeError

5.10.2 使用元类进行验证

class ValidateAttributesMeta(type):
    """验证属性类型的元类"""
    
    def __new__(mcs, name, bases, attrs):
        # 查找类型注解
        annotations = attrs.get('__annotations__', {})
        
        # 定义验证方法
        def validate_attributes(self):
            """验证实例属性的类型"""
            for attr_name, attr_type in annotations.items():
                if hasattr(self, attr_name):
                    value = getattr(self, attr_name)
                    if not isinstance(value, attr_type):
                        raise TypeError(f"{attr_name} must be of type {attr_type.__name__}, got {type(value).__name__}")
            return True
        
        # 将验证方法添加到类中
        attrs['validate_attributes'] = validate_attributes
        
        # 重写__init__方法，在初始化后进行验证
        original_init = attrs.get('__init__', lambda self: None)
        
        def new_init(self, *args, **kwargs):
            original_init(self, *args, **kwargs)
            self.validate_attributes()
        
        attrs['__init__'] = new_init
        
        return super().__new__(mcs, name, bases, attrs)

# 使用元类
class Person(metaclass=ValidateAttributesMeta):
    name: str
    age: int
    
    def __init__(self, name, age):
        self.name = name
        self.age = age

# 测试验证
person1 = Person("Alice", 30)  # 有效

# 测试无效类型
try:
    person2 = Person("Bob", "thirty")  # 无效，age不是整数
except TypeError as e:
    print(e)  # age must be of type int, got str

5.11 面向对象编程最佳实践

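Single responsibility principle: a class should be responsible for one functional area only.
Open/closed principle: software entities should be open for extension but closed for modification.
Liskov substitution principle: an instance of a subclass should be usable wherever the parent class is expected without breaking the program's behavior.
Interface segregation principle: clients should not be forced to depend on interfaces they do not use.
Dependency inversion principle: high-level modules should not depend on low-level modules; both should depend on abstractions.
Prefer composition over inheritance: building a class out of other objects that it holds as attributes is often more flexible than a deep inheritance hierarchy. (Mixins also reuse behavior, but they are still a form of inheritance.)
Naming conventions: class names use CapWords (e.g. MyClass); methods and attributes use lowercase with underscores (e.g. my_method).
Write docstrings: give classes and methods clear docstrings.
Use property decorators: prefer @property to explicit getter and setter methods.
Use inheritance judiciously: reserve inheritance for genuine "is-a" relationships and avoid over-engineering.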
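The composition guideline can be sketched as follows. The Engine, ElectricMotor, and Vehicle classes are hypothetical names chosen for illustration: a Vehicle holds a power unit as an attribute and delegates to it, so the power unit can be swapped without changing the class hierarchy.

```python
class Engine:
    def __init__(self, horsepower):
        self.horsepower = horsepower

    def start(self):
        return f"{self.horsepower} hp engine started"


class ElectricMotor:
    def __init__(self, kilowatts):
        self.kilowatts = kilowatts

    def start(self):
        return f"{self.kilowatts} kW motor started"


class Vehicle:
    """A Vehicle has a power unit; it does not inherit from one."""

    def __init__(self, brand, power_unit):
        self.brand = brand
        self.power_unit = power_unit  # any object with a start() method

    def start(self):
        # Delegate to the contained object instead of an inherited method.
        return f"{self.brand}: {self.power_unit.start()}"


car = Vehicle("Audi", Engine(150))
ev = Vehicle("Tesla", ElectricMotor(200))
print(car.start())  # Audi: 150 hp engine started
print(ev.start())   # Tesla: 200 kW motor started
```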

6. Python异常处理和文件操作

6.1 异常处理概述

异常是在程序执行过程中发生的错误事件，它会中断正常的程序流程。Python提供了强大的异常处理机制，可以优雅地处理这些错误，而不是让程序崩溃。

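Common built-in exception types include:

SyntaxError: invalid syntax
IndentationError: incorrect indentation (a subclass of SyntaxError)
NameError: a name that has not been defined
TypeError: an operation applied to an object of the wrong type, such as adding a string to a number
ValueError: a value of the right type but with an inappropriate value, such as converting a non-numeric string to an integer
ZeroDivisionError: division by zero
IndexError: a sequence index out of range
KeyError: a dictionary key that does not exist
FileNotFoundError: a file or directory that does not exist (a subclass of OSError)
OSError: operating-system-level errors such as failed input/output; in Python 3, IOError is kept only as an alias of OSError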

6.2 基本异常处理

6.2.1 try-except语句

使用try-except语句可以捕获并处理异常：

# 基本的try-except结构
try:
    # 可能引发异常的代码
    result = 10 / 0
except ZeroDivisionError:
    # 处理特定类型的异常
    print("不能除以零！")

# 输出: 不能除以零！

在上面的例子中，我们尝试执行一个可能引发ZeroDivisionError的操作，并在except块中提供了处理该异常的代码。

6.2.2 捕获多种异常

可以捕获多种类型的异常，有两种方式：

# 方式1：使用多个except块
try:
    number = int(input("请输入一个数字: "))
    result = 10 / number
except ValueError:
    print("输入的不是有效数字！")
except ZeroDivisionError:
    print("不能除以零！")

# 方式2：在一个except块中捕获多种异常
try:
    number = int(input("请输入一个数字: "))
    result = 10 / number
except (ValueError, ZeroDivisionError) as e:
    print(f"发生错误: {e}")

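Both forms can bind the exception object to a name with the as keyword (for example, except ValueError as e:), which gives access to the details of the error. The second form is convenient when several exception types should be handled in the same way.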

6.2.3 捕获所有异常

可以使用通用的Exception类来捕获所有非系统退出的异常：

try:
    # 可能引发任何异常的代码
    number = int(input("请输入一个数字: "))
    result = 10 / number
    numbers = [1, 2, 3]
    print(numbers[number])
except Exception as e:
    print(f"发生错误: {type(e).__name__}: {e}")

注意：虽然捕获所有异常可能很方便，但通常不推荐这样做，因为它会隐藏编程错误。最好只捕获你能处理的特定异常。

6.3 else和finally子句

6.3.1 else子句

try-except语句可以包含一个else子句，该子句在try块中没有引发异常时执行：

try:
    number = int(input("请输入一个数字: "))
    result = 10 / number
except (ValueError, ZeroDivisionError) as e:
    print(f"发生错误: {e}")
else:
    print(f"计算结果: {result}")

使用else子句的好处是可以将可能引发异常的代码与不会引发异常的代码分开，使代码更加清晰。

6.3.2 finally子句

try-except语句可以包含一个finally子句，该子句无论是否发生异常都会执行：

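file = None  # defined up front so the finally block can always refer to it
try:
    file = open("example.txt", "r")
    content = file.read()
    print(content)
except FileNotFoundError:
    print("文件不存在！")
finally:
    # Runs whether or not an exception occurred; close the file only if it was opened
    if file is not None and not file.closed:
        file.close()
        print("文件已关闭")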

finally子句通常用于清理资源，如关闭文件、释放网络连接等。

6.4 自定义异常

Python允许我们定义自己的异常类，这些类应该继承自内置的Exception类或其子类：

# 定义自定义异常
class InsufficientFundsError(Exception):
    """当余额不足时引发的异常"""
    def __init__(self, balance, amount):
        self.balance = balance
        self.amount = amount
        self.message = f"余额不足: 可用余额 {balance}, 需要 {amount}"
        super().__init__(self.message)

class BankAccount:
    def __init__(self, balance=0):
        self.balance = balance
    
    def withdraw(self, amount):
        if amount > self.balance:
            raise InsufficientFundsError(self.balance, amount)
        self.balance -= amount
        return self.balance

# 使用自定义异常
account = BankAccount(100)
try:
    account.withdraw(200)
except InsufficientFundsError as e:
    print(f"提款失败: {e}")
    print(f"余额: {e.balance}")
    print(f"尝试提款: {e.amount}")

6.5 异常的传递

当一个函数内部发生异常但没有捕获时，异常会向上传递到调用该函数的地方。如果在整个调用链中都没有捕获异常，最终程序会终止并显示错误信息。

def divide(a, b):
    return a / b

def calculate(a, b):
    try:
        return divide(a, b)
    except ZeroDivisionError:
        print("在calculate函数中捕获到除零错误")
        raise  # 重新引发异常

try:
    result = calculate(10, 0)
except ZeroDivisionError:
    print("在主程序中捕获到除零错误")

在上面的例子中，divide函数中发生的ZeroDivisionError被calculate函数捕获，然后又被重新引发，最终被主程序捕获。
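Besides re-raising the same exception with a bare raise, a handler can translate a low-level exception into a more meaningful one while keeping the original attached, using raise ... from. A minimal sketch; ConfigError and parse_port are hypothetical names used only for this illustration.

```python
class ConfigError(Exception):
    """Hypothetical application-level error used only for this sketch."""


def parse_port(raw):
    try:
        return int(raw)
    except ValueError as exc:
        # `from exc` stores the original exception in __cause__, so the
        # traceback shows both the low-level and the high-level error.
        raise ConfigError(f"invalid port: {raw!r}") from exc


try:
    parse_port("eighty")
except ConfigError as err:
    cause = err.__cause__
    summary = f"{err} (caused by {type(cause).__name__})"
    print(summary)
    # invalid port: 'eighty' (caused by ValueError)
```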

6.6 抛出异常

可以使用raise语句显式地抛出异常：

def validate_age(age):
    if not isinstance(age, int):
        raise TypeError("年龄必须是整数")
    if age < 0:
        raise ValueError("年龄不能为负数")
    if age > 150:
        raise ValueError("年龄不合理")
    return True

try:
    validate_age(-5)
except ValueError as e:
    print(f"验证失败: {e}")

6.7 文件操作概述

文件操作是编程中常见的任务，Python提供了简单而强大的文件操作功能。基本的文件操作包括：

打开文件
读取文件内容
写入文件内容
关闭文件

6.8 文件的打开和关闭

6.8.1 使用open()函数打开文件

在Python中，使用内置的open()函数来打开文件：

# 基本语法：open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

# 打开文件进行读取
file = open("example.txt", "r")  # 'r'表示读取模式

# 使用完毕后关闭文件
file.close()

6.8.2 文件模式

open()函数的mode参数指定了打开文件的模式：

模式	描述
`'r'`	只读模式（默认）
`'w'`	写入模式，如果文件存在则截断，如果不存在则创建
`'x'`	独占创建模式，如果文件已存在则失败
`'a'`	追加模式，如果文件不存在则创建
`'b'`	二进制模式
`'t'`	文本模式（默认）
`'+'`	更新模式（读写）

可以组合这些模式，例如：

'rb'：二进制只读模式
'w+'：读写模式，如果文件存在则截断
'a+'：追加和读取模式
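The modes above can be tried out safely in a temporary directory. The sketch below (the file name report.txt and the use of tempfile are assumptions made so that no real files are touched) shows that 'x' refuses to overwrite an existing file, and that 'a+' appends at the end while still allowing the file to be read after seeking back to the start.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "report.txt")

    # 'x' creates the file and fails if it already exists.
    with open(path, "x", encoding="utf-8") as f:
        f.write("first version\n")

    try:
        with open(path, "x", encoding="utf-8") as f:
            f.write("second version\n")
    except FileExistsError:
        overwritten = False
    else:
        overwritten = True

    # 'a+' always writes at the end; seek(0) is needed before reading.
    with open(path, "a+", encoding="utf-8") as f:
        f.write("appended line\n")
        f.seek(0)
        content = f.read()

print(overwritten)  # False
print(content)
```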

6.8.3 使用with语句自动关闭文件

使用with语句（上下文管理器）可以确保文件在使用完毕后自动关闭，即使发生异常：

# 使用with语句（推荐方式）
with open("example.txt", "r") as file:
    content = file.read()
    print(content)
# 在这里，文件已经自动关闭，不需要调用file.close()

使用with语句是处理文件的推荐方式，因为它更安全，代码也更简洁。
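The guarantee that with provides can be observed with a small custom context manager. The sketch below uses contextlib.contextmanager from the standard library; the tracked() helper and the events list are hypothetical and exist only to record the order in which things happen.

```python
from contextlib import contextmanager

events = []


@contextmanager
def tracked(name):
    events.append(f"open {name}")
    try:
        yield name  # the body of the with block runs here
    finally:
        # Runs on normal exit and when the body raises.
        events.append(f"close {name}")


try:
    with tracked("example") as resource:
        events.append(f"use {resource}")
        raise RuntimeError("boom")
except RuntimeError:
    events.append("handled")

print(events)
# ['open example', 'use example', 'close example', 'handled']
```

The cleanup step runs before the exception reaches the outer handler, which is the same ordering a file object follows when it is closed by a with block.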

6.9 文件的读取操作

6.9.1 读取整个文件

with open("example.txt", "r") as file:
    # 读取整个文件内容
    content = file.read()
    print(content)

6.9.2 逐行读取

with open("example.txt", "r") as file:
    # 逐行读取
    for line in file:
        print(line.strip())  # strip() removes leading and trailing whitespace, including the trailing newline

# 或者将所有行读取到列表中
with open("example.txt", "r") as file:
    lines = file.readlines()
    for line in lines:
        print(line.strip())

6.9.3 读取指定大小的数据

with open("example.txt", "r") as file:
    # 读取指定大小的数据（单位：字符）
    chunk = file.read(100)  # 读取前100个字符
    while chunk:
        print(chunk)
        chunk = file.read(100)  # 继续读取接下来的100个字符

6.10 文件的写入操作

6.10.1 写入文本

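# Write with mode 'w' (overwrites any existing content); the encoding is given explicitly because the text is not ASCII
with open("output.txt", "w", encoding="utf-8") as file:
    file.write("Hello, World!\n")
    file.write("这是第二行文本。\n")

# Append with mode 'a'
with open("output.txt", "a", encoding="utf-8") as file:
    file.write("这是追加的文本。\n")

# Write several lines at once (writelines() does not add newlines for you)
lines = ["第一行\n", "第二行\n", "第三行\n"]
with open("output.txt", "w", encoding="utf-8") as file:
    file.writelines(lines)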

6.10.2 写入二进制数据

# 写入二进制数据
with open("binary.dat", "wb") as file:
    # 创建一些二进制数据
    data = bytes([72, 101, 108, 108, 111])  # ASCII码的"Hello"
    file.write(data)

6.11 文件指针操作

文件指针是指向文件中当前位置的指针，可以通过以下方法操作：

with open("example.txt", "r+") as file:  # 使用r+模式进行读写
    # 读取前10个字符
    print(file.read(10))
    
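    # Get the current position (in text mode, tell() returns an opaque number for use with seek(), not necessarily a character count)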
    position = file.tell()
    print(f"当前位置: {position}")
    
    # 将文件指针移动到文件开头
    file.seek(0)
    
    # 再次读取前5个字符
    print(file.read(5))
    
    # 将文件指针移动到文件末尾
    file.seek(0, 2)  # 第二个参数0表示相对于文件开头，1表示相对于当前位置，2表示相对于文件末尾
    
    # 在文件末尾写入内容
    file.write("\n这是在文件末尾添加的内容。")
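The three whence values are easiest to follow in binary mode, where positions are plain byte offsets. A minimal sketch, using an in-memory io.BytesIO object as a stand-in for a file opened with "rb" (an assumption made so the example needs no file on disk):

```python
import io
import os

buf = io.BytesIO(b"0123456789")  # behaves like a binary file object

buf.seek(3)                   # whence=0 (os.SEEK_SET): 3 bytes from the start
first = buf.read(2)           # b"34"
pos = buf.tell()              # 5

buf.seek(-2, os.SEEK_CUR)     # whence=1: 2 bytes back from the current position
again = buf.read(2)           # b"34"

buf.seek(-4, os.SEEK_END)     # whence=2: 4 bytes before the end
tail = buf.read()             # b"6789"

print(first, pos, again, tail)
```

In text mode, seeking relative to the current position or the end is only allowed with an offset of 0, which is why the text-mode example above uses seek(0, 2).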

6.12 文件和目录管理

Python的os和os.path模块提供了文件和目录管理的功能：

import os
import os.path

# 获取当前工作目录
current_dir = os.getcwd()
print(f"当前工作目录: {current_dir}")

# 创建目录
new_dir = os.path.join(current_dir, "new_folder")
os.makedirs(new_dir, exist_ok=True)  # exist_ok=True表示如果目录已存在则不抛出异常

# 检查文件或目录是否存在
print(f"new_folder是否存在: {os.path.exists(new_dir)}")
print(f"example.txt是否存在: {os.path.exists('example.txt')}")

# 检查是否是文件或目录
print(f"new_folder是否是目录: {os.path.isdir(new_dir)}")
print(f"example.txt是否是文件: {os.path.isfile('example.txt')}")

# 获取文件大小
if os.path.isfile('example.txt'):
    file_size = os.path.getsize('example.txt')
    print(f"example.txt的大小: {file_size} 字节")

# 获取文件的绝对路径
abs_path = os.path.abspath('example.txt')
print(f"example.txt的绝对路径: {abs_path}")

# 列出目录内容
print("当前目录内容:")
for item in os.listdir(current_dir):
    item_path = os.path.join(current_dir, item)
    if os.path.isdir(item_path):
        print(f"[目录] {item}")
    else:
        print(f"[文件] {item}")

# 重命名文件或目录
old_name = os.path.join(new_dir, "old_name.txt")
new_name = os.path.join(new_dir, "new_name.txt")

# 先创建一个文件用于测试
with open(old_name, "w") as f:
    f.write("测试文件")

# 重命名文件
os.rename(old_name, new_name)
print(f"文件已从 {old_name} 重命名为 {new_name}")

# 删除文件
os.remove(new_name)
print(f"文件 {new_name} 已删除")

# 删除目录（目录必须为空）
os.rmdir(new_dir)
print(f"目录 {new_dir} 已删除")

6.13 使用pathlib进行文件操作

The pathlib module, part of the standard library since Python 3.4, provides an object-oriented API for working with file and directory paths:

from pathlib import Path

# 创建Path对象
current_dir = Path.cwd()
print(f"当前工作目录: {current_dir}")

# 创建新目录
new_dir = current_dir / "new_folder"
new_dir.mkdir(exist_ok=True)

# 创建文件路径
file_path = new_dir / "example.txt"

# 写入文件
with file_path.open("w") as file:
    file.write("Hello from pathlib!\n")

# 读取文件
content = file_path.read_text()
print(f"文件内容: {content}")

# 检查文件或目录是否存在
print(f"new_folder是否存在: {new_dir.exists()}")
print(f"example.txt是否存在: {file_path.exists()}")

# 检查是否是文件或目录
print(f"new_folder是否是目录: {new_dir.is_dir()}")
print(f"example.txt是否是文件: {file_path.is_file()}")

# 获取文件大小
if file_path.is_file():
    print(f"example.txt的大小: {file_path.stat().st_size} 字节")

# 获取绝对路径
print(f"example.txt的绝对路径: {file_path.absolute()}")

# 列出目录内容
print("new_folder目录内容:")
for item in new_dir.iterdir():
    if item.is_dir():
        print(f"[目录] {item.name}")
    else:
        print(f"[文件] {item.name}")

# 重命名文件
new_file_path = new_dir / "renamed.txt"
file_path.rename(new_file_path)
print(f"文件已重命名为: {new_file_path.name}")

# 删除文件
new_file_path.unlink()
print(f"文件 {new_file_path.name} 已删除")

# 删除目录
new_dir.rmdir()
print(f"目录 {new_dir.name} 已删除")

# 路径连接
parts = ["folder1", "folder2", "file.txt"]
path = Path("/")  # 根目录
for part in parts:
    path = path / part
print(f"连接后的路径: {path}")  # /folder1/folder2/file.txt

# 路径分解
path = Path("/home/user/documents/file.txt")
print(f"路径: {path}")
print(f"父目录: {path.parent}")
print(f"文件名: {path.name}")
print(f"扩展名: {path.suffix}")
print(f"无扩展名的文件名: {path.stem}")
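Path objects can also search directories by pattern. The sketch below builds a small tree in a temporary directory (the file names are made up for illustration) and contrasts glob(), which matches in one directory, with rglob(), which descends into subdirectories. It also shows with_suffix(), which computes a new path without touching the file system.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs").mkdir()
    (root / "a.txt").write_text("alpha", encoding="utf-8")
    (root / "docs" / "b.txt").write_text("beta", encoding="utf-8")
    (root / "docs" / "c.md").write_text("gamma", encoding="utf-8")

    # Only the top-level directory is searched.
    top_level = sorted(p.name for p in root.glob("*.txt"))

    # Subdirectories are searched as well.
    recursive = sorted(p.name for p in root.rglob("*.txt"))

    # Pure path computation: nothing is renamed on disk.
    renamed = (root / "docs" / "c.md").with_suffix(".txt").name

print(top_level)  # ['a.txt']
print(recursive)  # ['a.txt', 'b.txt']
print(renamed)    # c.txt
```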

6.14 处理CSV文件

CSV（逗号分隔值）是一种常见的文件格式，用于存储表格数据。Python的csv模块提供了读写CSV文件的功能：

import csv

# 写入CSV文件
with open("data.csv", "w", newline="") as file:
    writer = csv.writer(file)
    # 写入表头
    writer.writerow(["姓名", "年龄", "城市"])
    # 写入数据行
    writer.writerows([
        ["张三", 25, "北京"],
        ["李四", 30, "上海"],
        ["王五", 35, "广州"]
    ])

# 读取CSV文件
print("CSV文件内容:")
with open("data.csv", "r", newline="") as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)

# 使用DictWriter写入CSV文件
with open("dict_data.csv", "w", newline="") as file:
    fieldnames = ["姓名", "年龄", "城市"]
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    
    # 写入表头
    writer.writeheader()
    
    # 写入数据行
    writer.writerow({"姓名": "张三", "年龄": 25, "城市": "北京"})
    writer.writerow({"姓名": "李四", "年龄": 30, "城市": "上海"})
    writer.writerow({"姓名": "王五", "年龄": 35, "城市": "广州"})

# 使用DictReader读取CSV文件
print("\n使用DictReader读取CSV文件:")
with open("dict_data.csv", "r", newline="") as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(f"姓名: {row['姓名']}, 年龄: {row['年龄']}, 城市: {row['城市']}")
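One detail that is easy to miss: the csv module returns every field as a string, so numeric columns must be converted explicitly. A minimal sketch, reading from an in-memory io.StringIO object instead of a file (the sample data is invented for illustration):

```python
import csv
import io

raw = "name,age,city\nAlice,25,Beijing\nBob,30,Shanghai\n"

rows = list(csv.DictReader(io.StringIO(raw)))
print(rows[0]["age"], type(rows[0]["age"]).__name__)  # 25 str

# Convert before doing arithmetic.
ages = [int(row["age"]) for row in rows]
average_age = sum(ages) / len(ages)
print(average_age)  # 27.5
```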

6.15 处理JSON文件

JSON（JavaScript对象表示法）是一种轻量级的数据交换格式。Python的json模块提供了处理JSON数据的功能：

import json

# 示例数据
data = {
    "name": "张三",
    "age": 25,
    "city": "北京",
    "skills": ["Python", "Java", "JavaScript"],
    "is_student": False
}

# 写入JSON文件
with open("data.json", "w", encoding="utf-8") as file:
    # indent参数用于格式化输出（美化）
    json.dump(data, file, ensure_ascii=False, indent=4)

# 读取JSON文件
with open("data.json", "r", encoding="utf-8") as file:
    loaded_data = json.load(file)
    print("从JSON文件加载的数据:")
    print(loaded_data)
    print(f"姓名: {loaded_data['name']}")
    print(f"技能: {', '.join(loaded_data['skills'])}")

# JSON字符串和Python对象之间的转换
# Python对象转JSON字符串
json_string = json.dumps(data, ensure_ascii=False, indent=4)
print("\nJSON字符串:")
print(json_string)

# JSON字符串转Python对象
python_object = json.loads(json_string)
print("\n从JSON字符串转换回Python对象:")
print(python_object)
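json.dumps() only understands basic types (dict, list, str, numbers, bool, None); anything else raises TypeError unless a default function is supplied. The sketch below assumes a hypothetical encode_extra() helper that converts date objects and sets; the conversion choices (ISO strings, sorted lists) are one reasonable option, not the only one.

```python
import json
from datetime import date


def encode_extra(obj):
    """Called by json.dumps() for objects it cannot serialize itself."""
    if isinstance(obj, date):
        return obj.isoformat()
    if isinstance(obj, set):
        return sorted(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


record = {"name": "Alice", "joined": date(2023, 10, 1), "tags": {"b", "a"}}
text = json.dumps(record, default=encode_extra, sort_keys=True)
print(text)
# {"joined": "2023-10-01", "name": "Alice", "tags": ["a", "b"]}

# Loading gives back plain strings and lists, not date or set objects.
restored = json.loads(text)
```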

6.16 文件操作的最佳实践

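Always use the with statement: it guarantees the file is closed when the block ends, even if an exception is raised.
Specify the encoding: when working with text files, pass encoding explicitly (for example encoding="utf-8"), especially when the content is not pure ASCII.
Handle errors: catch the exceptions that file operations can raise, such as FileNotFoundError and PermissionError.
Check whether the file exists: check before opening or operating on a file when that helps produce a clearer message, but still handle the exception, because the file can disappear between the check and the operation.
Use pathlib: on Python 3.4 and later, prefer the pathlib module for path handling; it offers a more modern, object-oriented API.
Choose the right file mode: pick the open mode that matches what you need to do.
Clean up resources: make sure every resource is released once you are done with it.
Handle large files carefully: read large files in chunks or line by line instead of loading the whole file at once.
File locking: in multithreaded or multiprocess programs, consider a file-locking mechanism to avoid concurrency problems.
Logging: record important file operations to make debugging and monitoring easier.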

6.17 示例：文件复制程序

下面是一个简单的文件复制程序，演示了异常处理和文件操作的综合使用：

import os

def copy_file(source_path, destination_path):
    """
    复制文件
    
    Args:
        source_path: 源文件路径
        destination_path: 目标文件路径
    
    Returns:
        bool: 如果复制成功返回True，否则返回False
    """
    try:
        # 检查源文件是否存在
        if not os.path.isfile(source_path):
            print(f"错误: 源文件 '{source_path}' 不存在")
            return False
        
        # 检查目标目录是否存在，如果不存在则创建
        destination_dir = os.path.dirname(destination_path)
        if destination_dir and not os.path.exists(destination_dir):
            os.makedirs(destination_dir)
            print(f"已创建目录: {destination_dir}")
        
        # 复制文件
        with open(source_path, 'rb') as source_file:
            with open(destination_path, 'wb') as dest_file:
                # 分块读取和写入，适用于大文件
                chunk_size = 4096  # 4KB
                while True:
                    chunk = source_file.read(chunk_size)
                    if not chunk:
                        break
                    dest_file.write(chunk)
        
        # 验证复制是否成功
        if os.path.getsize(source_path) == os.path.getsize(destination_path):
            print(f"文件已成功复制: {source_path} -> {destination_path}")
            return True
        else:
            print(f"错误: 文件大小不匹配，复制可能不完整")
            return False
    
    except PermissionError:
        print(f"错误: 权限被拒绝，无法访问或写入文件")
        return False
    except IOError as e:
        print(f"错误: IO错误 - {e}")
        return False
    except Exception as e:
        print(f"错误: 发生意外错误 - {e}")
        return False

# 使用示例
if __name__ == "__main__":
    source = "example.txt"
    destination = "backup/copied_example.txt"
    
    # 如果源文件不存在，先创建一个
    if not os.path.isfile(source):
        with open(source, 'w') as f:
            f.write("这是一个测试文件。\n它包含一些文本内容。")
        print(f"已创建测试文件: {source}")
    
    # 复制文件
    success = copy_file(source, destination)
    
    if success:
        print("复制操作成功完成!")
    else:
        print("复制操作失败!")
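The program above copies the bytes by hand, which is useful for learning. In everyday code the standard library's shutil module already covers this task. A minimal sketch, assuming a hypothetical copy_with_shutil() helper that lets exceptions propagate to the caller instead of returning True or False:

```python
import shutil
import tempfile
from pathlib import Path


def copy_with_shutil(source, destination):
    """Copy a file, creating the destination directory if needed."""
    source, destination = Path(source), Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    # copy2() copies the contents and tries to preserve file metadata.
    shutil.copy2(source, destination)
    return destination


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "example.txt"
    src.write_text("test content\n", encoding="utf-8")

    dst = copy_with_shutil(src, Path(tmp) / "backup" / "copy.txt")
    copied_text = dst.read_text(encoding="utf-8")

print(copied_text)
```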

6.18 示例：文件分析程序

下面是一个简单的文件分析程序，可以统计文本文件中的行数、单词数和字符数：

import os  # needed by the usage example below (os.path.isfile)

def analyze_file(file_path):
    """
    分析文本文件的基本信息
    
    Args:
        file_path: 文件路径
    
    Returns:
        dict: 包含文件分析结果的字典
    """
    try:
        with open(file_path, 'r', encoding='utf-8') as file:
            lines = file.readlines()
        
        # 统计行数
        line_count = len(lines)
        
        # 统计单词数和字符数
        word_count = 0
        char_count = 0
        for line in lines:
            # 统计字符数（包括空格和换行符）
            char_count += len(line)
            # Count words by splitting on whitespace (text written without spaces, such as Chinese, will be undercounted)
            words = line.split()
            word_count += len(words)
        
        # 统计非空行数
        non_empty_lines = sum(1 for line in lines if line.strip())
        
        # 返回分析结果
        return {
            'file_path': file_path,
            'line_count': line_count,
            'non_empty_line_count': non_empty_lines,
            'word_count': word_count,
            'char_count': char_count
        }
    
    except FileNotFoundError:
        print(f"错误: 文件 '{file_path}' 不存在")
        return None
    except UnicodeDecodeError:
        print(f"错误: 无法解码文件 '{file_path}'，请确保它是文本文件")
        return None
    except Exception as e:
        print(f"错误: 分析文件时发生错误 - {e}")
        return None

# 使用示例
if __name__ == "__main__":
    file_path = "example.txt"
    
    # 如果文件不存在，先创建一个
    if not os.path.isfile(file_path):
        with open(file_path, 'w', encoding='utf-8') as f:
            f.write("这是第一行。\n")
            f.write("这是第二行，包含一些单词。\n")
            f.write("\n")  # 空行
            f.write("这是第四行，用于测试文件分析程序。")
        print(f"已创建测试文件: {file_path}")
    
    # 分析文件
    result = analyze_file(file_path)
    
    if result:
        print(f"\n文件分析结果:")
        print(f"文件路径: {result['file_path']}")
        print(f"总行数: {result['line_count']}")
        print(f"非空行数: {result['non_empty_line_count']}")
        print(f"单词数: {result['word_count']}")
        print(f"字符数: {result['char_count']}")

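7. Python Standard Library and Common Modules

7.1 Standard Library Overview

The standard library ships with Python and provides a wide range of modules and functions for everyday tasks, with no third-party packages to install. It covers file handling, network programming, database access, text processing, mathematics, and more.

7.2 Commonly Used Standard Library Modules

7.2.1 The sys Module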

sys模块提供了访问Python解释器相关变量和函数的接口。

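```python
import sys

# Command-line arguments
print(f"命令行参数: {sys.argv}")

# Python interpreter version
print(f"Python版本: {sys.version}")

# Redirect standard output to a file
with open('output.txt', 'w', encoding='utf-8') as f:
    sys.stdout = f
    print("这条消息将被写入文件")
sys.stdout = sys.__stdout__  # restore standard output

# Module search path
print("模块搜索路径:")
for path in sys.path:
    print(f"  {path}")

# Exit the program; this comes last because sys.exit() raises SystemExit
# and nothing after it would run
sys.exit(0)  # 0 means a normal exit
```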

7.2.2 os模块

os模块提供了与操作系统交互的功能，如文件和目录操作、进程管理等。

import os

# 获取当前工作目录
current_dir = os.getcwd()
print(f"当前工作目录: {current_dir}")

# 改变工作目录
os.chdir('..')
print(f"新的工作目录: {os.getcwd()}")
os.chdir(current_dir)  # 恢复

# 创建目录
os.makedirs('test_folder', exist_ok=True)

# 列出目录内容
print("当前目录内容:")
for item in os.listdir('.'):
    print(f"  {item}")

# 检查文件/目录是否存在
print(f"test_folder是否存在: {os.path.exists('test_folder')}")
print(f"test_folder是否是目录: {os.path.isdir('test_folder')}")

# 获取环境变量
path = os.environ.get('PATH')
print(f"PATH环境变量: {path}")

# Run a shell command (os.system() returns the command's exit status, not its output)
result = os.system('dir' if os.name == 'nt' else 'ls -la')
print(f"命令执行结果: {result}")

# 删除目录
os.rmdir('test_folder')
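os.system() only reports an exit status. When the command's output is needed, the standard library's subprocess module is the usual choice. A minimal sketch, assuming the current interpreter (sys.executable) as the child command so that the example runs on any platform:

```python
import subprocess
import sys

completed = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    capture_output=True,  # collect stdout and stderr instead of printing them
    text=True,            # decode the output to str
    check=True,           # raise CalledProcessError on a non-zero exit status
)

output = completed.stdout.strip()
print(completed.returncode)  # 0
print(output)                # hello from a child process
```

Passing the command as a list avoids going through the shell, which sidesteps quoting problems when arguments contain spaces.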

7.2.3 math模块

math模块提供了数学计算相关的函数。

import math

# 基本数学函数
print(f"绝对值: {math.fabs(-10)}")
print(f"向上取整: {math.ceil(4.2)}")
print(f"向下取整: {math.floor(4.8)}")
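# round() is a built-in, not part of math; ties round to the nearest even integer, so round(4.5) is 4
print(f"round(4.5): {round(4.5)}")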

# 幂和对数
print(f"2的3次方: {math.pow(2, 3)}")
print(f"平方根: {math.sqrt(16)}")
print(f"自然对数: {math.log(math.e)}")
print(f"以10为底的对数: {math.log10(100)}")

# 三角函数
print(f"正弦(π/2): {math.sin(math.pi/2)}")
print(f"余弦(0): {math.cos(0)}")
print(f"正切(π/4): {math.tan(math.pi/4)}")

# 常量
print(f"π: {math.pi}")
print(f"自然对数的底e: {math.e}")
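Two numeric details are worth seeing in code: the built-in round() rounds ties to the nearest even number, and floating-point results should be compared with math.isclose() rather than ==. The round_half_up() helper below is a hypothetical sketch of conventional "round half up" behavior built on the decimal module.

```python
import math
from decimal import Decimal, ROUND_HALF_UP

# Ties go to the nearest even integer.
print(round(0.5), round(1.5), round(2.5))  # 0 2 2


def round_half_up(value, digits=0):
    """Round so that ties always go away from zero (hypothetical helper)."""
    quantum = Decimal(10) ** -digits  # digits=2 gives Decimal("0.01")
    # str(value) avoids carrying binary floating-point error into Decimal.
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)


print(round_half_up(4.5))       # 5
print(round_half_up(2.675, 2))  # 2.68

# 0.1 + 0.2 is not exactly 0.3 in binary floating point.
print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True
```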

7.2.4 random模块

random模块提供了生成随机数的功能。

import random

# 生成0到1之间的随机浮点数
print(f"随机浮点数: {random.random()}")

# 生成指定范围内的随机整数
print(f"1到100的随机整数: {random.randint(1, 100)}")

# 生成指定步长的随机数
print(f"0到100之间的偶数: {random.randrange(0, 101, 2)}")

# 从序列中随机选择一个元素
colors = ['红色', '绿色', '蓝色', '黄色']
print(f"随机颜色: {random.choice(colors)}")

# 从序列中随机选择指定数量的元素（可重复）
print(f"随机颜色(有放回): {random.choices(colors, k=3)}")

# 从序列中随机选择指定数量的元素（不可重复）
print(f"随机颜色(无放回): {random.sample(colors, k=2)}")

# 打乱序列
random.shuffle(colors)
print(f"打乱后的颜色: {colors}")

# 设置随机种子
random.seed(42)  # 使用相同的种子可以产生相同的随机序列
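Seeding is easiest to observe with independent generator objects. The sketch below creates two random.Random instances with the same seed and checks that they produce the same sequence; the variable names are chosen for illustration. The random module is not suitable for passwords or tokens, so the last lines use the secrets module for that purpose.

```python
import random
import secrets

rng_a = random.Random(42)
rng_b = random.Random(42)

seq_a = [rng_a.randint(1, 100) for _ in range(5)]
seq_b = [rng_b.randint(1, 100) for _ in range(5)]
print(seq_a == seq_b)  # True: same seed, same sequence

# Separate Random objects do not disturb the module-level generator,
# which helps keep tests reproducible.

# For security-sensitive values, use secrets instead of random.
token = secrets.token_hex(8)  # 8 random bytes as 16 hexadecimal characters
print(len(token))  # 16
```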

7.2.5 datetime模块

datetime模块提供了处理日期和时间的类和函数。

import datetime

# 获取当前日期和时间
now = datetime.datetime.now()
print(f"当前时间: {now}")

# 获取当前日期
today = datetime.date.today()
print(f"当前日期: {today}")

# 创建特定的日期和时间
specific_date = datetime.date(2023, 10, 1)
specific_datetime = datetime.datetime(2023, 10, 1, 12, 30, 45)
print(f"特定日期: {specific_date}")
print(f"特定日期时间: {specific_datetime}")

# 日期时间的属性
print(f"年份: {now.year}")
print(f"月份: {now.month}")
print(f"日: {now.day}")
print(f"小时: {now.hour}")
print(f"分钟: {now.minute}")
print(f"秒: {now.second}")
print(f"星期几: {now.weekday()}")  # 0=星期一, 6=星期日

# 日期时间格式化
formatted = now.strftime("%Y-%m-%d %H:%M:%S")
print(f"格式化时间: {formatted}")
print(f"简短格式: {now.strftime('%d/%m/%y %H:%M')}")

# 解析字符串为日期时间
date_string = "2023-12-25 08:00:00"
parsed = datetime.datetime.strptime(date_string, "%Y-%m-%d %H:%M:%S")
print(f"解析后的时间: {parsed}")

# 时间差
delta = datetime.timedelta(days=10, hours=5, minutes=30)
future_date = now + delta
print(f"10天5小时30分钟后的时间: {future_date}")

# 计算两个日期之间的差
diff = future_date - now
print(f"时间差: {diff.days}天, {diff.seconds//3600}小时")
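The datetime objects above are naive, meaning they carry no time zone. For values exchanged between systems, time-zone-aware objects are usually safer. A minimal sketch using only the datetime module; the fixed UTC+8 offset and its label are assumptions for illustration. It also shows total_seconds(), which covers the whole duration, unlike the seconds attribute, which holds only the part smaller than one day.

```python
from datetime import datetime, timedelta, timezone

utc_time = datetime(2023, 10, 1, 12, 0, tzinfo=timezone.utc)

# A fixed-offset zone; real-world zones with daylight saving need zoneinfo.
beijing = timezone(timedelta(hours=8), name="UTC+08:00")
local_time = utc_time.astimezone(beijing)
print(local_time.isoformat())  # 2023-10-01T20:00:00+08:00

# Both objects describe the same instant.
print(local_time == utc_time)  # True

delta = timedelta(days=10, hours=5, minutes=30)
print(delta.seconds)          # 19800 (5 h 30 min; days are not included)
print(delta.total_seconds())  # 883800.0
```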

7.2.6 collections模块

collections模块提供了额外的容器数据类型，扩展了内置的列表、字典、集合等类型。

from collections import Counter, defaultdict, deque, namedtuple, OrderedDict

# Counter: 计数
text = "hello python programming"
counter = Counter(text)
print(f"字符计数: {counter}")
print(f"出现频率最高的3个字符: {counter.most_common(3)}")

# defaultdict: 默认值字典
colors = [('red', 1), ('blue', 2), ('red', 3), ('green', 4)]
dd = defaultdict(list)
for color, value in colors:
    dd[color].append(value)
print(f"默认值字典: {dict(dd)}")

# deque: 双端队列
queue = deque([1, 2, 3, 4, 5])
queue.append(6)  # 右侧添加
queue.appendleft(0)  # 左侧添加
print(f"双端队列: {queue}")
queue.pop()  # 右侧删除
queue.popleft()  # 左侧删除
print(f"操作后: {queue}")
queue.rotate(1)  # 向右旋转
print(f"旋转后: {queue}")

# namedtuple: 命名元组
Person = namedtuple('Person', ['name', 'age', 'city'])
person = Person('Alice', 30, 'Beijing')
print(f"命名元组: {person}")
print(f"姓名: {person.name}")
print(f"年龄: {person.age}")

# OrderedDict: 有序字典（Python 3.7+中普通字典也保持插入顺序）
od = OrderedDict()
od['a'] = 1
od['b'] = 2
od['c'] = 3
print(f"有序字典: {od}")
od.move_to_end('a')  # 移动到末尾
print(f"移动后: {od}")
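
Returning to deque from the listing above: its optional maxlen argument turns it into a bounded buffer that silently drops the oldest item. The following is a minimal sketch of a sliding window; the names recent and moving_average are illustrative.

```python
from collections import deque

recent = deque(maxlen=3)  # keeps only the 3 most recently appended items
for reading in [1, 2, 3, 4, 5]:
    recent.append(reading)

print(list(recent))  # [3, 4, 5]

moving_average = sum(recent) / len(recent)
print(moving_average)  # 4.0
```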

7.2.7 The itertools module

The itertools module provides iterator building blocks for efficient looping.

import itertools

# 无限迭代器
# count: 从指定值开始的无限整数序列
for i in itertools.count(10, 2):  # 从10开始，步长为2
    print(i, end=' ')
    if i >= 20:
        break
print()

# cycle: 无限循环指定序列
count = 0
for item in itertools.cycle(['A', 'B', 'C']):
    print(item, end=' ')
    count += 1
    if count >= 10:
        break
print()

# repeat: 重复指定值
for item in itertools.repeat('Python', 3):  # 重复3次
    print(item, end=' ')
print()

# 有限迭代器
# accumulate: 累加
numbers = [1, 2, 3, 4, 5]
print(f"累加: {list(itertools.accumulate(numbers))}")

# chain: 连接多个迭代器
letters = ['a', 'b', 'c']
numbers = [1, 2, 3]
print(f"连接: {list(itertools.chain(letters, numbers))}")

# combinations: 组合（不考虑顺序）
print(f"组合: {list(itertools.combinations('ABC', 2))}")

# permutations: 排列（考虑顺序）
print(f"排列: {list(itertools.permutations('ABC', 2))}")

# product: 笛卡尔积
print(f"笛卡尔积: {list(itertools.product('AB', '12'))}")

# filterfalse: 过滤出不满足条件的元素
print(f"偶数: {list(itertools.filterfalse(lambda x: x % 2, range(10)))}")
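
The count and cycle loops above stop with a manual break. As an alternative sketch, itertools.islice can take a fixed number of items from an infinite iterator without an explicit counter; the names first_six and cycled are illustrative.

```python
import itertools

# Take the first six values of count(10, 2) instead of breaking out of a loop.
first_six = list(itertools.islice(itertools.count(10, 2), 6))
print(first_six)  # [10, 12, 14, 16, 18, 20]

# The same idea works for cycle.
cycled = list(itertools.islice(itertools.cycle("ABC"), 7))
print(cycled)  # ['A', 'B', 'C', 'A', 'B', 'C', 'A']
```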

7.2.8 The re module

The re module provides regular expression support for matching, searching, and replacing text.

import re

# 基本匹配
pattern = r'\d+'  # 匹配一个或多个数字
text = "我有100元，你有200元"
match = re.search(pattern, text)
if match:
    print(f"找到匹配: {match.group()}")

# 查找所有匹配
matches = re.findall(pattern, text)
print(f"所有匹配: {matches}")

# 替换
replaced = re.sub(r'\d+', '***', text)
print(f"替换后: {replaced}")

# 分割
text = "apple,orange,banana"
split_result = re.split(r',', text)
print(f"分割结果: {split_result}")

# 编译正则表达式
pattern = re.compile(r'[a-zA-Z]+')
text = "Hello 123 World 456"
print(f"匹配英文字母: {pattern.findall(text)}")

# 分组
pattern = r'(\d{4})-(\d{2})-(\d{2})'
text = "今天是2023-10-01，明天是2023-10-02"
match = re.search(pattern, text)
if match:
    print(f"日期: {match.group(0)}")
    print(f"年: {match.group(1)}")
    print(f"月: {match.group(2)}")
    print(f"日: {match.group(3)}")
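
The grouping example above uses re.search, which returns only the first date. A possible extension, sketched here with named groups and re.finditer, collects every match; the pattern name date_pattern and the sample sentence are illustrative.

```python
import re

# Named groups make the parts of each match self-describing.
date_pattern = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
text = "Today is 2023-10-01, tomorrow is 2023-10-02"

# finditer yields one match object per occurrence, left to right.
dates = [m.groupdict() for m in date_pattern.finditer(text)]
for d in dates:
    print(d["year"], d["month"], d["day"])
```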

7.2.9 The json module

The json module encodes Python objects as JSON and decodes JSON back into Python objects.

import json

# Python对象转JSON字符串
person = {
    'name': '张三',
    'age': 30,
    'city': '北京',
    'hobbies': ['读书', '旅行', '编程'],
    'is_student': False
}

json_str = json.dumps(person, ensure_ascii=False, indent=4)
print(f"JSON字符串:\n{json_str}")

# JSON字符串转Python对象
python_obj = json.loads(json_str)
print(f"Python对象: {python_obj}")
print(f"姓名: {python_obj['name']}")
print(f"爱好: {python_obj['hobbies']}")

# 写入JSON文件
with open('person.json', 'w', encoding='utf-8') as f:
    json.dump(person, f, ensure_ascii=False, indent=4)

# 读取JSON文件
with open('person.json', 'r', encoding='utf-8') as f:
    loaded_person = json.load(f)
print(f"从文件加载: {loaded_person}")
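
The dictionary above contains only types that json understands natively. For other types, such as dates, json.dumps accepts a default callable. The sketch below assumes ISO 8601 strings are an acceptable representation; the helper name to_jsonable is hypothetical.

```python
import json
from datetime import date, datetime

def to_jsonable(obj):
    """Fallback for objects the json module cannot encode by itself."""
    if isinstance(obj, (date, datetime)):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

record = {"name": "Alice", "joined": date(2023, 10, 1)}
encoded = json.dumps(record, default=to_jsonable)
print(encoded)  # {"name": "Alice", "joined": "2023-10-01"}
```

Decoding does not reverse this automatically: json.loads returns the date as a plain string.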

7.2.10 The csv module

The csv module reads and writes CSV files.

import csv

# 写入CSV文件
with open('students.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    # 写入表头
    writer.writerow(['姓名', '年龄', '成绩'])
    # 写入数据
    writer.writerows([
        ['张三', 20, 85],
        ['李四', 21, 90],
        ['王五', 19, 78]
    ])

# 读取CSV文件
print("CSV文件内容:")
with open('students.csv', 'r', newline='', encoding='utf-8') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

# 使用DictWriter
with open('scores.csv', 'w', newline='', encoding='utf-8') as f:
    fieldnames = ['姓名', '数学', '语文', '英语']
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerow({'姓名': '张三', '数学': 95, '语文': 88, '英语': 92})
    writer.writerow({'姓名': '李四', '数学': 88, '语文': 92, '英语': 85})

# 使用DictReader
print("\n使用DictReader读取:")
with open('scores.csv', 'r', newline='', encoding='utf-8') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(f"{row['姓名']}: 数学={row['数学']}, 语文={row['语文']}, 英语={row['英语']}")
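
One detail the listing above does not show is that csv readers return every field as a string. The sketch below converts a column to int before doing arithmetic; it uses io.StringIO in place of a real file, and the column names are illustrative.

```python
import csv
import io

raw = "name,math\nAlice,95\nBob,88\n"

# io.StringIO stands in for an open file, so the sketch needs no disk access.
rows = list(csv.DictReader(io.StringIO(raw)))
print(rows[0]["math"])  # '95' is a string, not an int

# Convert explicitly before calculating.
scores = {row["name"]: int(row["math"]) for row in rows}
average = sum(scores.values()) / len(scores)
print(average)  # 91.5
```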

7.2.11 The time module

The time module provides time-related functions.

import time

# 获取当前时间戳
current_time = time.time()
print(f"当前时间戳: {current_time}")

# 时间戳转本地时间
time_tuple = time.localtime(current_time)
print(f"本地时间元组: {time_tuple}")

# 格式化时间
formatted_time = time.strftime("%Y-%m-%d %H:%M:%S", time_tuple)
print(f"格式化时间: {formatted_time}")

# 字符串转时间元组
time_str = "2023-10-01 12:30:45"
time_tuple = time.strptime(time_str, "%Y-%m-%d %H:%M:%S")
print(f"解析后的时间元组: {time_tuple}")

# 暂停程序执行
time.sleep(1)  # 暂停1秒

# Timing a piece of code
start_time = time.perf_counter()  # monotonic, high-resolution clock intended for measuring intervals
for i in range(1000000):
    pass
end_time = time.perf_counter()
print(f"Elapsed: {end_time - start_time:.6f} s")

7.2.12 The threading module

The threading module provides thread-based concurrency.

import threading
import time

# 线程函数
def print_numbers():
    for i in range(1, 6):
        print(f"数字: {i}")
        time.sleep(0.5)

def print_letters():
    for letter in 'ABCDE':
        print(f"字母: {letter}")
        time.sleep(0.7)

# 创建线程
t1 = threading.Thread(target=print_numbers)
t2 = threading.Thread(target=print_letters)

# 启动线程
t1.start()
t2.start()

# 等待线程结束
t1.join()
t2.join()

print("所有线程执行完毕")

# 使用线程锁
counter = 0
lock = threading.Lock()

def increment_counter():
    global counter
    for _ in range(100000):
        with lock:  # 加锁
            counter += 1

t3 = threading.Thread(target=increment_counter)
t4 = threading.Thread(target=increment_counter)

t3.start()
t4.start()

t3.join()
t4.join()

print(f"最终计数器值: {counter}")

7.2.13 The queue module

The queue module provides thread-safe queue data structures.

import queue
import threading
import time

# 创建队列
q = queue.Queue(maxsize=10)

# 生产者函数
def producer():
    for i in range(1, 11):
        item = f"项目-{i}"
        q.put(item)
        print(f"生产者: 放入 {item}")
        time.sleep(0.3)

# 消费者函数
def consumer():
    while True:
        try:
            # 等待1秒获取项目
            item = q.get(timeout=1)
            print(f"消费者: 取出 {item}")
            q.task_done()  # 标记任务完成
            time.sleep(0.5)
        except queue.Empty:
            print("队列空，消费者退出")
            break

# 创建线程
producer_thread = threading.Thread(target=producer)
consumer_thread = threading.Thread(target=consumer)

# 启动线程
producer_thread.start()
consumer_thread.start()

# 等待生产者完成
producer_thread.join()

# 等待队列清空
q.join()

print("所有任务完成")
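
The consumer above stops when get() times out, which couples shutdown to timing. A common alternative, sketched here, is to put a sentinel value on the queue to signal that no more work will arrive. SENTINEL, consume, work and results are illustrative names.

```python
import queue
import threading

SENTINEL = None  # hypothetical marker meaning "no more work"

def consume(q, results):
    while True:
        item = q.get()
        if item is SENTINEL:
            q.task_done()
            break
        results.append(item * 2)
        q.task_done()

work = queue.Queue()
results = []
worker = threading.Thread(target=consume, args=(work, results))
worker.start()

for n in range(5):
    work.put(n)
work.put(SENTINEL)  # tell the consumer to stop

work.join()    # returns once every put() has a matching task_done()
worker.join()
print(results)  # [0, 2, 4, 6, 8]
```

With several consumers, one sentinel per consumer is needed so that each thread sees its own stop signal.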

7.2.14 The subprocess module

The subprocess module spawns new processes, connects to their input/output/error pipes, and obtains their return codes.

import subprocess
import sys

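# Run a simple command (the current interpreter is used so the example works on every platform;
# echo is a shell built-in on Windows and cannot be launched as a program there)
result = subprocess.run(
    [sys.executable, '-c', 'print("Hello, World!")'],
    capture_output=True, text=True
)
print(f"Command output: {result.stdout}")
print(f"Return code: {result.returncode}")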

# 执行shell命令（Windows下可能需要调整）
if sys.platform.startswith('win'):
    result = subprocess.run('dir', shell=True, capture_output=True, text=True)
else:
    result = subprocess.run('ls -la', shell=True, capture_output=True, text=True)

print(f"目录列表:\n{result.stdout}")

# 交互式进程
process = subprocess.Popen(
    [sys.executable, '-c', 'print("请输入您的名字:"); name = input(); print(f"你好, {name}!")'],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True
)

output, error = process.communicate(input="张三\n")
print(f"程序输出:\n{output}")
if error:
    print(f"错误:\n{error}")
print(f"返回码: {process.returncode}")
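
The calls above inspect returncode by hand. As a sketch of the alternative, check=True makes subprocess.run raise CalledProcessError on a non-zero exit status. The child command here is a made-up one-liner that exits with status 3, and the 30-second timeout is an arbitrary choice.

```python
import subprocess
import sys

try:
    subprocess.run(
        [sys.executable, "-c", "import sys; sys.exit(3)"],
        check=True,            # raise if the exit status is non-zero
        capture_output=True,
        text=True,
        timeout=30,            # do not wait forever for the child
    )
    exit_code = 0
except subprocess.CalledProcessError as exc:
    exit_code = exc.returncode

print(exit_code)  # 3
```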

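7.3 Common third-party libraries

Beyond the standard library, Python has a rich ecosystem of third-party libraries that can be installed with pip. Some commonly used ones follow.

7.3.1 requests - HTTP library

The requests library makes sending HTTP requests simple.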

# 安装: pip install requests
import requests

# 发送GET请求
response = requests.get('https://api.github.com/users/github')
print(f"状态码: {response.status_code}")
print(f"响应内容: {response.json()}")

# 发送POST请求
data = {'name': '张三', 'age': 30}
response = requests.post('https://httpbin.org/post', data=data)
print(f"POST响应: {response.json()}")

# 添加请求头
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get('https://httpbin.org/headers', headers=headers)

# 处理异常
try:
    response = requests.get('https://nonexistenturl.example.com', timeout=5)
    response.raise_for_status()  # raises HTTPError for 4xx and 5xx status codes
except requests.exceptions.RequestException as e:
    print(f"请求异常: {e}")

7.3.2 BeautifulSoup - HTML parsing library

BeautifulSoup parses HTML and XML documents and extracts data from them.

# 安装: pip install beautifulsoup4 lxml
from bs4 import BeautifulSoup
import requests

# 获取网页内容
response = requests.get('https://python.org/')
html_content = response.text

# 解析HTML
soup = BeautifulSoup(html_content, 'lxml')

# 查找元素
print(f"标题: {soup.title.text}")

# 查找所有链接
print("\n所有链接:")
for link in soup.find_all('a')[:5]:  # 只显示前5个链接
    print(f"文本: {link.text.strip()}")
    print(f"URL: {link.get('href')}")

# 使用CSS选择器
print("\n使用CSS选择器:")
for section in soup.select('.shrubbery')[:2]:
    print(f"章节标题: {section.h2.text}")

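7.3.3 NumPy - scientific computing library

NumPy is the foundational library for scientific computing in Python. It provides a high-performance multidimensional array object and tools for working with it.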

# 安装: pip install numpy
import numpy as np

# 创建数组
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([[1, 2, 3], [4, 5, 6]])

print(f"一维数组: {arr1}")
print(f"二维数组:\n{arr2}")
print(f"数组形状: {arr2.shape}")
print(f"数组维度: {arr2.ndim}")
print(f"数组元素类型: {arr2.dtype}")

# 数组操作
print(f"\n数组加法: {arr1 + 10}")
print(f"数组乘法: {arr1 * 2}")
print(f"数组平方: {arr1 ** 2}")

# 数学函数
print(f"\n正弦值: {np.sin(arr1)}")
print(f"平均值: {np.mean(arr2)}")
print(f"总和: {np.sum(arr2, axis=0)}")  # 按列求和
print(f"最大值: {np.max(arr2, axis=1)}")  # 按行求最大值

# 数组索引和切片
print(f"\n第一个元素: {arr1[0]}")
print(f"子数组:\n{arr2[0:2, 1:3]}")

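# Special arrays
zeros = np.zeros((2, 3))
ones = np.ones((3, 2))
identity = np.eye(3)
random_values = np.random.rand(2, 3)  # uniform samples in [0, 1); named so it does not shadow the random module

print(f"\nZeros:\n{zeros}")
print(f"Ones:\n{ones}")
print(f"Identity matrix:\n{identity}")
print(f"Random matrix:\n{random_values}")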

7.3.4 pandas - data analysis library

pandas is a powerful data analysis and manipulation library built around the DataFrame and Series data structures.

# 安装: pip install pandas
import pandas as pd

# 创建Series
s = pd.Series([10, 20, 30, 40, 50], index=['a', 'b', 'c', 'd', 'e'])
print(f"Series:\n{s}")

# 创建DataFrame
data = {
    '姓名': ['张三', '李四', '王五', '赵六'],
    '年龄': [25, 30, 35, 40],
    '城市': ['北京', '上海', '广州', '深圳'],
    '工资': [10000, 15000, 12000, 18000]
}
df = pd.DataFrame(data)
print(f"\nDataFrame:\n{df}")

# 基本操作
print(f"\n前两行:\n{df.head(2)}")
print(f"\n后两行:\n{df.tail(2)}")
print(f"\n基本统计:\n{df.describe()}")
print(f"\n按年龄排序:\n{df.sort_values('年龄')}")

# 数据选择
print(f"\n选择姓名列:\n{df['姓名']}")
print(f"\n选择前两行的姓名和工资:\n{df.loc[:1, ['姓名', '工资']]}")

# 条件过滤
high_salary = df[df['工资'] > 12000]
print(f"\n高工资员工:\n{high_salary}")

# 添加新列
df['年薪'] = df['工资'] * 12
print(f"\n添加年薪列:\n{df}")

# Write the DataFrame to a CSV file, then read it back
df.to_csv('employees.csv', index=False, encoding='utf-8')
new_df = pd.read_csv('employees.csv')
print(f"\n从CSV读取:\n{new_df}")

7.3.5 matplotlib - data visualization library

matplotlib is a Python library for creating static, animated, and interactive visualizations.

# 安装: pip install matplotlib
import matplotlib.pyplot as plt
import numpy as np

# 创建数据
x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = np.cos(x)

# 创建图形
plt.figure(figsize=(10, 6))

# 绘制线条
plt.plot(x, y1, label='sin(x)', color='blue', linewidth=2)
plt.plot(x, y2, label='cos(x)', color='red', linestyle='--')

# 添加标题和标签
plt.title('Sin and Cos Functions')
plt.xlabel('x values')
plt.ylabel('y values')

# 添加图例
plt.legend()

# 添加网格
plt.grid(True)

# 保存图形
plt.savefig('sin_cos_plot.png')

# 显示图形
plt.show()

# 绘制柱状图
data = [5, 7, 9, 11, 13]
labels = ['A', 'B', 'C', 'D', 'E']

plt.figure(figsize=(8, 5))
plt.bar(labels, data, color=['red', 'green', 'blue', 'yellow', 'orange'])
plt.title('Bar Chart Example')
plt.xlabel('Categories')
plt.ylabel('Values')
plt.show()

# 绘制散点图
x = np.random.rand(50)
y = np.random.rand(50)
colors = np.random.rand(50)
sizes = 1000 * np.random.rand(50)

plt.figure(figsize=(8, 8))
plt.scatter(x, y, c=colors, s=sizes, alpha=0.5, cmap='viridis')
plt.colorbar()
plt.title('Scatter Plot Example')
plt.show()

7.3.6 Django - web framework

Django is a high-level Python web framework that encourages rapid development and clean, pragmatic design.

# 安装: pip install django

# 创建Django项目（在命令行中）
# django-admin startproject myproject
# cd myproject
# python manage.py startapp myapp

# 在myapp/views.py中的示例
"""
from django.http import HttpResponse
from django.shortcuts import render

def index(request):
    return HttpResponse("Hello, Django!")

def template_view(request):
    context = {'name': 'Python', 'version': '3.10'}
    return render(request, 'myapp/index.html', context)
"""

# 在myproject/urls.py中的示例
"""
from django.contrib import admin
from django.urls import path
from myapp import views

urlpatterns = [
    path('admin/', admin.site.urls),
    path('', views.index, name='index'),
    path('template/', views.template_view, name='template'),
]
"""

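7.3.7 Flask - lightweight web framework

Flask is a lightweight Python web framework. It is called a "microframework" because its core is small and it does not mandate particular tools such as a database layer or form validation; those are added through extensions.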

# 安装: pip install flask

# Flask应用示例
"""
from flask import Flask, render_template, request, jsonify

app = Flask(__name__)

@app.route('/')
def hello():
    return "Hello, Flask!"

@app.route('/greet/<name>')
def greet(name):
    return f"Hello, {name}!"

@app.route('/template')
def template():
    return render_template('index.html', name='Python')

@app.route('/api/data', methods=['GET', 'POST'])
def api_data():
    if request.method == 'POST':
        data = request.get_json()
        return jsonify({'status': 'success', 'received': data})
    return jsonify({'message': 'GET request'})

if __name__ == '__main__':
    app.run(debug=True)
"""

7.3.8 SQLAlchemy - ORM framework

SQLAlchemy is a Python SQL toolkit and object-relational mapper that gives application developers the full power and flexibility of SQL.

# 安装: pip install sqlalchemy

# SQLAlchemy示例
"""
from sqlalchemy import create_engine, Column, Integer, String, ForeignKey
from sqlalchemy.orm import declarative_base, sessionmaker, relationship  # declarative_base lives in sqlalchemy.orm since 1.4

# 创建引擎
engine = create_engine('sqlite:///example.db')

# 创建基类
Base = declarative_base()

# 定义模型
class User(Base):
    __tablename__ = 'users'
    
    id = Column(Integer, primary_key=True)
    name = Column(String)
    age = Column(Integer)
    posts = relationship("Post", back_populates="user")

class Post(Base):
    __tablename__ = 'posts'
    
    id = Column(Integer, primary_key=True)
    title = Column(String)
    content = Column(String)
    user_id = Column(Integer, ForeignKey('users.id'))
    user = relationship("User", back_populates="posts")

# 创建表
Base.metadata.create_all(engine)

# 创建会话
Session = sessionmaker(bind=engine)
session = Session()

# 添加数据
user = User(name='张三', age=30)
session.add(user)

post = Post(title='第一篇文章', content='这是内容', user=user)
session.add(post)

session.commit()

# 查询数据
users = session.query(User).all()
for user in users:
    print(f"用户: {user.name}, 年龄: {user.age}")
    for post in user.posts:
        print(f"  文章: {post.title}")
"""

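7.4 Installing and managing standard and third-party libraries

7.4.1 The pip package manager

pip is Python's package installer. It installs, upgrades, and uninstalls Python packages.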

# 安装包
pip install numpy pandas matplotlib

# 安装特定版本的包
pip install requests==2.27.1

# 升级包
pip install --upgrade numpy

# 卸载包
pip uninstall matplotlib

# 列出已安装的包
pip list

# 导出已安装的包到requirements.txt文件
pip freeze > requirements.txt

# 从requirements.txt文件安装包
pip install -r requirements.txt

# 显示包的详细信息
pip show numpy

7.4.2 Using virtual environments

A virtual environment gives each project an isolated Python environment, which avoids package version conflicts.

# 使用venv（Python 3.3+）创建虚拟环境
python -m venv myenv

# 激活虚拟环境（Windows）
myenv\Scripts\activate

# 激活虚拟环境（Linux/macOS）
source myenv/bin/activate

# 退出虚拟环境
deactivate

# 使用pipenv创建和管理虚拟环境
pip install pipenv
pipenv install numpy
pipenv shell

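7.5 Choosing and using libraries

Prefer the standard library: it ships with Python, is thoroughly tested and stable, and needs no extra installation.
Choose popular, well-maintained third-party libraries: check GitHub stars, forks, and update frequency, and favor libraries with an active community.
Mind version compatibility: pin dependency versions with requirements.txt or Pipfile so the project runs the same way in every environment.
Read the official documentation: it is usually the most reliable resource for learning and using a library.
Understand performance characteristics: libraries differ in performance, so choose according to the project's needs.
Avoid reinventing the wheel: before building a feature, check whether an existing library already provides it.
Track security updates: update dependencies regularly to pick up fixes for known vulnerabilities.
Consider the learning cost: on team projects, libraries with a gentle learning curve improve productivity.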

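8. Advanced Python features and best practices

8.1 Generators

A generator is a special kind of iterator and a more concise way to write one. A generator function uses yield instead of return: each time execution reaches yield, the function pauses and hands back a value, and the next request resumes it from where it paused.

8.1.1 Basic generator functions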

def countdown(n):
    """生成一个从n倒计时到0的生成器"""
    while n >= 0:
        yield n
        n -= 1

# 使用生成器
for i in countdown(5):
    print(i, end=' ')
print()

# 生成器对象
gen = countdown(3)
print(next(gen))  # 输出: 3
print(next(gen))  # 输出: 2
print(next(gen))  # 输出: 1
print(next(gen))  # 输出: 0
# print(next(gen))  # 抛出 StopIteration 异常

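8.1.2 Generator expressions

A generator expression is a concise way to create a generator. It looks like a list comprehension but uses parentheses instead of square brackets.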

# 生成器表达式
squares = (x**2 for x in range(1, 6))

print("生成器表达式结果:")
for square in squares:
    print(square, end=' ')
print()

# 列表推导式与生成器表达式的区别
list_squares = [x**2 for x in range(1000)]  # 创建整个列表，占用内存
print(f"列表推导式内存占用估算: {len(list_squares)} 个元素")

gen_squares = (x**2 for x in range(1000))  # 按需生成，内存占用小
print("生成器表达式按需生成元素")
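
The listing above prints the number of elements rather than an actual size. A rough sketch with sys.getsizeof makes the difference visible; note that getsizeof reports only the shallow size of the container object, not the integers it refers to. The variable names are illustrative.

```python
import sys

list_squares = [x**2 for x in range(1000)]
gen_squares = (x**2 for x in range(1000))

list_size = sys.getsizeof(list_squares)  # grows with the number of elements
gen_size = sys.getsizeof(gen_squares)    # small and independent of the range length
print(list_size > gen_size)

# The trade-off: a generator can be consumed only once.
first_total = sum(gen_squares)
second_total = sum(gen_squares)  # 0, because the generator is already exhausted
print(first_total, second_total)
```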

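8.1.3 Processing large data with generators

Generators are well suited to large data sets and infinite sequences because they do not load all of the data into memory at once.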

def read_large_file(file_path):
    """逐行读取大文件"""
    with open(file_path, 'r', encoding='utf-8') as file:
        for line in file:  # 逐行读取，内存友好
            yield line.strip()

# 示例：处理日志文件
def process_logs(file_path):
    for line in read_large_file(file_path):
        if 'ERROR' in line:
            yield line

# 无限序列生成器
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# 使用无限生成器
fib = fibonacci()
print("斐波那契数列前10个数:")
for _ in range(10):
    print(next(fib), end=' ')
print()

8.1.4 Generators and coroutines

A generator can act as a simple coroutine: the send() method passes a value into it.

def echo():
    print("开始协程")
    while True:
        received = yield  # 接收发送的值
        print(f"收到: {received}")

coroutine = echo()
next(coroutine)  # 启动协程，执行到第一个yield

coroutine.send("Hello")
coroutine.send("Python")
coroutine.close()  # 关闭协程

def accumulate():
    total = 0
    while True:
        value = yield total  # 发送累计值并接收新值
        if value is None:
            break
        total += value
    return total

acc = accumulate()
next(acc)  # 启动协程

print(f"加10后: {acc.send(10)}")
print(f"加20后: {acc.send(20)}")
print(f"加30后: {acc.send(30)}")

try:
    acc.send(None)  # finish the coroutine
except StopIteration as e:
    print(f"Final result: {e.value}")

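8.2 Advanced decorator usage

A decorator is a function that modifies the behavior of another function. It adds functionality without changing the original function's code.

8.2.1 Stacking multiple decorators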

def log_entry(func):
    def wrapper(*args, **kwargs):
        print(f"开始调用 {func.__name__}")
        return func(*args, **kwargs)
    return wrapper

def log_exit(func):
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        print(f"结束调用 {func.__name__}")
        return result
    return wrapper

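import time

def timer(func):
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        print(f"{func.__name__} took {time.perf_counter() - start:.6f} s")
        return result
    return wrapper

@log_exit
@log_entry  # decorators apply bottom-up: timer wraps first, then log_entry, then log_exit
@timer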
def fibonacci(n):
    """计算斐波那契数列第n个数"""
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

print(f"fibonacci(10) = {fibonacci(10)}")

8.2.2 Parameterized decorators

def retry(max_retries=3):
    def decorator(func):
        def wrapper(*args, **kwargs):
            retries = 0
            while retries < max_retries:
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    retries += 1
                    print(f"尝试 {retries}/{max_retries} 失败: {e}")
                    if retries >= max_retries:
                        raise
            return func(*args, **kwargs)
        return wrapper
    return decorator

@retry(max_retries=5)
def unstable_operation():
    import random
    if random.random() < 0.7:
        raise ValueError("随机失败")
    return "操作成功"

try:
    result = unstable_operation()
    print(result)
except Exception as e:
    print(f"最终失败: {e}")

8.2.3 Class-based decorators

To use a class as a decorator, implement the __call__ method.

class CountCalls:
    def __init__(self, func):
        self.func = func
        self.calls = 0
        self.__name__ = func.__name__
        self.__doc__ = func.__doc__
    
    def __call__(self, *args, **kwargs):
        self.calls += 1
        print(f"函数 {self.func.__name__} 已调用 {self.calls} 次")
        return self.func(*args, **kwargs)

@CountCalls
def greet(name):
    """问候函数"""
    return f"你好, {name}!"

print(greet("张三"))
print(greet("李四"))
print(greet("王五"))
print(f"文档: {greet.__doc__}")

8.2.4 functools.wraps

The functools.wraps decorator preserves the original function's metadata.

import functools

def my_decorator(func):
    @functools.wraps(func)  # 保留原函数的元数据
    def wrapper(*args, **kwargs):
        """包装函数"""
        print(f"调用 {func.__name__}")
        return func(*args, **kwargs)
    return wrapper

@my_decorator
def original_function():
    """原始函数文档"""
    return "原始函数结果"

print(f"函数名: {original_function.__name__}")
print(f"文档: {original_function.__doc__}")
print(original_function())

8.2.5 Common decorator examples

# 1. 缓存装饰器
import functools

def memoize(func):
    cache = {}
    
    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    
    return wrapper

@memoize
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

# 2. 访问控制装饰器
def require_permission(permission):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(user, *args, **kwargs):
            if permission not in user.get('permissions', []):
                raise PermissionError(f"需要 {permission} 权限")
            return func(user, *args, **kwargs)
        return wrapper
    return decorator

@require_permission('admin')
def delete_user(user, user_id):
    return f"用户 {user_id} 已被删除"

# 测试
admin_user = {'name': '管理员', 'permissions': ['admin']}
regular_user = {'name': '普通用户', 'permissions': ['user']}

try:
    print(delete_user(admin_user, 1))
    print(delete_user(regular_user, 1))
except PermissionError as e:
    print(e)
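
The hand-written memoize decorator above has a standard-library counterpart, functools.lru_cache. The sketch below applies it to the same Fibonacci recursion; the function name fib is illustrative, and maxsize=None (an unbounded cache) is an assumption that suits this small example.

```python
import functools

@functools.lru_cache(maxsize=None)  # unbounded cache keyed by the call arguments
def fib(n):
    return n if n <= 1 else fib(n - 1) + fib(n - 2)

print(fib(30))  # 832040

# cache_info() reports hits, misses, maxsize and the current cache size.
info = fib.cache_info()
print(info)
```

Like the memoize version, this works only when all arguments are hashable.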

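8.3 Context managers

A context manager lets you acquire and release resources precisely when needed. It is most commonly used through the with statement.

8.3.1 Using the with statement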

# File handling without with: open first, then guarantee close() in finally
file = open('example.txt', 'w')
try:
    file.write('Hello, World!')
finally:
    file.close()

# 使用with语句更简洁
with open('example.txt', 'w') as file:
    file.write('Hello, World!')

# 多个上下文管理器
with open('input.txt', 'r') as input_file, open('output.txt', 'w') as output_file:
    for line in input_file:
        output_file.write(line.upper())

8.3.2 Custom context managers

A custom context manager is created by implementing the __enter__ and __exit__ methods.

class Timer:
    def __enter__(self):
        import time
        self.start_time = time.time()
        return self
    
    def __exit__(self, exc_type, exc_val, exc_tb):
        import time
        self.end_time = time.time()
        self.elapsed_time = self.end_time - self.start_time
        print(f"执行时间: {self.elapsed_time:.6f} 秒")
        # 如果返回True，表示异常已处理，不会向上传播
        return False

# 使用自定义上下文管理器
with Timer():
    result = sum(i for i in range(1000000))
    print(f"结果: {result}")

# 异常处理示例
with Timer():
    1/0  # 会打印执行时间，然后抛出异常

8.3.3 The contextlib module

The contextlib module offers more concise ways to create context managers.

import contextlib

@contextlib.contextmanager
def managed_file(name, mode):
    file = open(name, mode)  # open outside try: if this fails there is nothing to close
    try:
        yield file  # hand the resource to the with statement
    finally:
        file.close()

with managed_file('example.txt', 'w') as f:
    f.write('使用contextmanager创建的上下文管理器')

# 临时改变工作目录
@contextlib.contextmanager
def change_dir(new_dir):
    import os
    old_dir = os.getcwd()
    try:
        os.chdir(new_dir)
        yield
    finally:
        os.chdir(old_dir)

# 使用closing
class Resource:
    def __init__(self):
        print("资源初始化")
    
    def close(self):
        print("资源关闭")

with contextlib.closing(Resource()):
    print("使用资源")
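
contextlib also ships ready-made context managers. The sketch below shows two of them, suppress and redirect_stdout; the file name no_such_file.txt and the variable names are illustrative.

```python
import contextlib
import io
import os
import tempfile

# suppress: ignore a specific exception instead of writing try/except/pass.
with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "no_such_file.txt")
    with contextlib.suppress(FileNotFoundError):
        os.remove(missing)  # raises FileNotFoundError, which is swallowed
    reached = True          # execution continues after the with block

# redirect_stdout: temporarily send print() output somewhere else.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    print("captured")
captured = buffer.getvalue()
```

suppress is best limited to narrow exception types, since anything it swallows leaves no trace.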

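8.4 Metaprogramming

Metaprogramming means writing code that manipulates code itself. Python offers several tools for it, such as metaclasses, descriptors, and reflection.

8.4.1 Metaclasses

A metaclass is a class whose instances are classes. The default metaclass is type.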

# 使用type创建类
Person = type('Person', (object,), {
    'name': '未知',
    'age': 0,
    'greet': lambda self: f"你好，我是{self.name}，今年{self.age}岁"
})

person = Person()
person.name = '张三'
person.age = 30
print(person.greet())

# 自定义元类
class Meta(type):
    def __new__(mcs, name, bases, attrs):
        # 添加一个类属性
        attrs['created_by'] = '元类'
        # Wrap the greet method, if the class defines one
        if 'greet' in attrs:
            original_greet = attrs['greet']
            def wrapped_greet(self):
                print("元类包装的greet方法")
                return original_greet(self)
            attrs['greet'] = wrapped_greet
        return super().__new__(mcs, name, bases, attrs)

class Person(metaclass=Meta):
    def __init__(self, name):
        self.name = name
    
    def greet(self):
        return f"你好，我是{self.name}"

person = Person("李四")
print(person.greet())
print(f"created_by: {Person.created_by}")

8.4.2 Descriptors

A descriptor is an object that implements __get__, __set__, or __delete__ and thereby customizes attribute access.

class ReadOnly:
    def __init__(self, name):
        self.name = name
    
    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__[self.name]
    
    def __set__(self, instance, value):
        raise AttributeError("不能修改只读属性")

class Person:
    name = ReadOnly('_name')
    
    def __init__(self, name):
        self._name = name

person = Person("王五")
print(person.name)  # 可以访问
try:
    person.name = "赵六"  # 不能修改
except AttributeError as e:
    print(e)

# 属性验证器
class ValidatedAttribute:
    def __init__(self, name, validator=None):
        self.name = name
        self.validator = validator or (lambda x: True)
    
    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__[self.name]
    
    def __set__(self, instance, value):
        if not self.validator(value):
            raise ValueError(f"无效值: {value}")
        instance.__dict__[self.name] = value

class User:
    age = ValidatedAttribute('_age', lambda x: isinstance(x, int) and 0 <= x <= 120)
    
    def __init__(self, name, age):
        self.name = name
        self.age = age

user = User("张三", 30)
print(f"{user.name}, {user.age}岁")
try:
    user.age = 200  # 无效值
except ValueError as e:
    print(e)
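
Both descriptors above receive the storage name as a constructor argument. Since Python 3.6 a descriptor can learn its attribute name through __set_name__ instead. The following is a minimal sketch under that assumption; the Positive and Product classes are hypothetical.

```python
class Positive:
    """Descriptor that accepts only numbers greater than zero (illustrative)."""

    def __set_name__(self, owner, name):
        # Called once when the owning class is created; name is the attribute name.
        self.storage_name = "_" + name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f"{self.storage_name[1:]} must be positive, got {value}")
        setattr(instance, self.storage_name, value)

class Product:
    price = Positive()     # no name argument needed
    quantity = Positive()

    def __init__(self, price, quantity):
        self.price = price
        self.quantity = quantity

item = Product(9.5, 3)
print(item.price, item.quantity)  # 9.5 3
```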

8.4.3 Reflection

Reflection is the ability to inspect or modify an object's attributes and methods at runtime.

class Person:
    def __init__(self, name, age):
        self.name = name
        self._age = age  # leading underscore marks it as internal by convention; access is not enforced
    
    def greet(self):
        return f"你好，我是{self.name}"

person = Person("张三", 30)

# 获取属性
print(f"name属性: {getattr(person, 'name')}")
print(f"是否有email属性: {hasattr(person, 'email')}")

# 设置属性
setattr(person, 'email', 'zhangsan@example.com')
print(f"email属性: {person.email}")

# 获取方法并调用
greet_method = getattr(person, 'greet')
print(greet_method())

# 获取类名和模块名
print(f"类名: {person.__class__.__name__}")
print(f"模块名: {person.__class__.__module__}")

# 获取所有属性和方法
print("\n所有属性和方法:")
for attr in dir(person):
    if not attr.startswith('__'):
        print(f"  {attr}")

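8.5 Functional programming

Functional programming is a paradigm that treats computation as the evaluation of mathematical functions and avoids changing state and mutable data.

8.5.1 lambda expressions

A lambda expression creates a small anonymous function.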

# 基本用法
square = lambda x: x**2
print(f"5的平方: {square(5)}")

# 在高阶函数中使用
numbers = [1, 2, 3, 4, 5]
squared_numbers = list(map(lambda x: x**2, numbers))
print(f"平方后的列表: {squared_numbers}")

# 过滤
even_numbers = list(filter(lambda x: x % 2 == 0, numbers))
print(f"偶数列表: {even_numbers}")

# 排序
people = [('张三', 30), ('李四', 25), ('王五', 35)]
people.sort(key=lambda x: x[1])  # 按年龄排序
print(f"按年龄排序: {people}")

8.5.2 Higher-order functions

A higher-order function takes one or more functions as arguments, or returns a function.

# map: 对序列中的每个元素应用函数
numbers = [1, 2, 3, 4, 5]
doubled = list(map(lambda x: x * 2, numbers))
print(f"映射后: {doubled}")

# filter: 过滤序列中符合条件的元素
even = list(filter(lambda x: x % 2 == 0, numbers))
print(f"过滤后: {even}")

# reduce: 对序列中的元素累积应用函数
from functools import reduce
sum_result = reduce(lambda x, y: x + y, numbers)
print(f"累积和: {sum_result}")

# sorted: 排序
words = ['apple', 'banana', 'cherry', 'date']
sorted_words = sorted(words, key=lambda x: len(x))
print(f"按长度排序: {sorted_words}")
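
Another standard-library higher-order function is functools.partial, which returns a new callable with some arguments already filled in. The sketch below is illustrative; parse_binary, power and square are made-up names.

```python
from functools import partial

# Fix the keyword argument base=2 of the built-in int().
parse_binary = partial(int, base=2)
print(parse_binary("1010"))  # 10

def power(base, exponent):
    return base ** exponent

# Fix exponent=2 to obtain a one-argument function that works with map().
square = partial(power, exponent=2)
squares = list(map(square, [1, 2, 3]))
print(squares)  # [1, 4, 9]
```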

8.5.3 Function composition

Function composition combines several functions into a new one.

def compose(f, g):
    """函数组合: f(g(x))"""
    return lambda x: f(g(x))

def double(x):
    return x * 2

def increment(x):
    return x + 1

# 先increment后double
double_after_increment = compose(double, increment)
print(f"先加1后乘2: {double_after_increment(5)}")

# 先double后increment
increment_after_double = compose(increment, double)
print(f"先乘2后加1: {increment_after_double(5)}")

# 使用functools.reduce进行多函数组合
from functools import reduce

def compose_multiple(*funcs):
    """组合多个函数"""
    return reduce(lambda f, g: lambda x: f(g(x)), funcs)

# 先加1，再乘2，最后减3
result = compose_multiple(lambda x: x - 3, double, increment)(5)
print(f"组合结果: {result}")

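8.5.4 Closures

A closure is a function that remembers the variables of the enclosing scope in which it was defined and can still use them when it is called after that scope has finished executing.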

def outer(x):
    def inner(y):
        return x + y  # inner函数访问了outer函数的变量x
    return inner

add_five = outer(5)
print(f"5 + 3 = {add_five(3)}")
print(f"5 + 7 = {add_five(7)}")

# 使用闭包创建计数器
def make_counter():
    count = 0
    
    def increment():
        nonlocal count
        count += 1
        return count
    
    def get_count():
        return count  # reading an enclosing variable needs no nonlocal declaration
    
    return {'increment': increment, 'get_count': get_count}

counter = make_counter()
print(f"计数: {counter['increment']()}")
print(f"计数: {counter['increment']()}")
print(f"当前计数: {counter['get_count']()}")
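
One consequence of closures capturing variables rather than values is the late-binding pitfall. The sketch below shows the effect and one common workaround, binding the current value through a default argument; all names are illustrative.

```python
# Each lambda closes over the same variable i, which ends the loop at 2.
callbacks = [lambda: i for i in range(3)]
late = [cb() for cb in callbacks]
print(late)  # [2, 2, 2]

# A default argument is evaluated when the lambda is created,
# so each function keeps its own value.
callbacks = [lambda i=i: i for i in range(3)]
bound = [cb() for cb in callbacks]
print(bound)  # [0, 1, 2]
```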

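8.6 Concurrency

Python offers several concurrency models: multithreading, multiprocessing, and asynchronous programming.

8.6.1 Multithreading

Because of the GIL (global interpreter lock) in CPython, threads cannot execute Python bytecode in parallel, so they do not speed up CPU-bound work. They remain effective for I/O-bound work.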

import threading
import time
import concurrent.futures

# 基本多线程
def worker(task_id, sleep_time):
    print(f"任务 {task_id} 开始执行")
    time.sleep(sleep_time)
    print(f"任务 {task_id} 执行完毕，耗时 {sleep_time} 秒")
    return task_id, sleep_time

# 创建并启动线程
threads = []
for i in range(5):
    t = threading.Thread(target=worker, args=(i, 1 + i % 3))
    threads.append(t)
    t.start()

# 等待所有线程完成
for t in threads:
    t.join()

print("所有线程执行完毕")

# 使用线程池
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    # 提交任务
    future_to_task = {executor.submit(worker, i, 1 + i % 3): i for i in range(5)}
    
    # 等待结果
    for future in concurrent.futures.as_completed(future_to_task):
        task_id = future_to_task[future]
        try:
            result = future.result()
            print(f"任务 {result[0]} 完成，耗时 {result[1]} 秒")
        except Exception as e:
            print(f"任务 {task_id} 发生异常: {e}")

8.6.2 Multiprocessing

For CPU-bound tasks, multiple processes sidestep the GIL and make full use of multi-core CPUs.

import multiprocessing
import time
import concurrent.futures

def cpu_intensive_task(n):
    """CPU密集型任务：计算斐波那契数列"""
    def fibonacci(x):
        if x <= 1:
            return x
        return fibonacci(x-1) + fibonacci(x-2)
    return fibonacci(n)

# 基本多进程
if __name__ == '__main__':  # Windows下必须使用if __name__ == '__main__'
    processes = []
    numbers = [30, 31, 32, 33]
    
    start_time = time.time()
    
    # 使用多进程
    for num in numbers:
        p = multiprocessing.Process(target=cpu_intensive_task, args=(num,))
        processes.append(p)
        p.start()
    
    for p in processes:
        p.join()
    
    multi_time = time.time() - start_time
    print(f"多进程耗时: {multi_time:.2f}秒")
    
    # 串行执行进行比较
    start_time = time.time()
    for num in numbers:
        cpu_intensive_task(num)
    serial_time = time.time() - start_time
    print(f"串行耗时: {serial_time:.2f}秒")
    print(f"加速比: {serial_time / multi_time:.2f}倍")

    # 使用进程池
    with concurrent.futures.ProcessPoolExecutor() as executor:
        future_to_num = {executor.submit(cpu_intensive_task, num): num for num in numbers}
        
        for future in concurrent.futures.as_completed(future_to_num):
            num = future_to_num[future]
            try:
                result = future.result()
                print(f"斐波那契({num}) = {result}")
            except Exception as e:
                print(f"计算斐波那契({num})时出错: {e}")

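8.6.3 Asynchronous programming

Asynchronous programming uses the async/await syntax. It suits I/O-bound tasks and achieves concurrency within a single thread. The HTTP example below relies on the third-party aiohttp library (pip install aiohttp).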

import asyncio
import aiohttp
import time

async def fetch_url(session, url):
    """异步获取URL内容"""
    async with session.get(url) as response:
        return await response.text()

async def main():
    urls = [
        'https://www.python.org',
        'https://docs.python.org',
        'https://pypi.org',
        'https://github.com/python'
    ]
    
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_url(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
        
        for i, (url, content) in enumerate(zip(urls, results)):
            print(f"URL {i+1}: {url}")
            # len() of a str counts characters, not bytes
            print(f"Content length: {len(content)} characters")

# Basic coroutine example
async def say_after(delay, what):
    await asyncio.sleep(delay)
    print(what)

async def basic_async_example():
    print(f"Start time: {time.strftime('%H:%M:%S')}")

    # Schedule both coroutines so they run concurrently (about 2 s in total, not 3)
    task1 = asyncio.create_task(say_after(1, "Hello"))
    task2 = asyncio.create_task(say_after(2, "World"))

    await task1
    await task2

    print(f"End time: {time.strftime('%H:%M:%S')}")

if __name__ == '__main__':
    # Run the basic example
    asyncio.run(basic_async_example())

    # Run the HTTP example
    try:
        asyncio.run(main())
    except aiohttp.ClientError as e:
        print(f"HTTP request failed: {e}")
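
To see the single-threaded concurrency without any network access, the sketch below replaces real I/O with `asyncio.sleep`. The names `fake_io` and `run_concurrently` are hypothetical, chosen only for this illustration. It shows that `asyncio.gather` returns results in argument order and that the total wait is close to the longest delay rather than the sum of all delays.

```python
import asyncio
import time

async def fake_io(name, delay):
    """Stand-in for an I/O call; awaiting sleep yields control to the event loop."""
    await asyncio.sleep(delay)
    return f"{name} done"

async def run_concurrently():
    """Run three simulated I/O calls at once and measure the elapsed time."""
    start = time.perf_counter()
    results = await asyncio.gather(
        fake_io("a", 0.2),
        fake_io("b", 0.1),
        fake_io("c", 0.2),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed
```

Running `asyncio.run(run_concurrently())` returns the three results in the order a, b, c, even though b finishes first, and the elapsed time is roughly 0.2 s instead of the 0.5 s a sequential run would need.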

8.7 Python Best Practices

8.7.1 Code Style (PEP 8)

PEP 8 is the official style guide for Python code. Following it keeps code readable and consistent across projects.

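# Indentation: 4 spaces per level
# Line length: at most 79 characters
# Import order: standard library -> third-party -> local modules,
# with a blank line between groups
import os
import sys
from datetime import datetime

import requests  # third-party package

from mymodule import my_function  # placeholder local module

# Functions and variables: lowercase with underscores (snake_case)
def calculate_average(numbers):
    """Return the mean of numbers, or 0 for an empty sequence."""
    if not numbers:
        return 0
    return sum(numbers) / len(numbers)

# Classes: CapWords (CamelCase with a leading capital)
class DataProcessor:
    def process(self, data):
        pass

# Constants: uppercase with underscores
MAX_RETRY_ATTEMPTS = 5
DEFAULT_TIMEOUT = 30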

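8.7.2 Documenting Code

Good documentation helps others understand and use your code. A docstring is stored on the object as `__doc__` and is what `help()` displays.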

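import math

def calculate_area(radius):
    """Compute the area of a circle.

    Args:
        radius (float): Radius of the circle.

    Returns:
        float: Area of the circle.

    Raises:
        ValueError: If radius is negative.
    """
    if radius < 0:
        raise ValueError("radius must not be negative")
    return math.pi * radius ** 2  # math.pi instead of a truncated literal

class Rectangle:
    """A rectangle.

    Attributes:
        width (float): Width of the rectangle.
        height (float): Height of the rectangle.
    """

    def __init__(self, width, height):
        """Initialize the rectangle.

        Args:
            width (float): Width of the rectangle.
            height (float): Height of the rectangle.
        """
        self.width = width
        self.height = height

    def area(self):
        """Compute the area of the rectangle.

        Returns:
            float: Area of the rectangle.
        """
        return self.width * self.height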
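
Docstrings can also carry runnable examples that the standard `doctest` module checks, which helps keep documentation and behavior in sync. The sketch below is one possible way to do this. `circle_area` and `run_docstring_examples` are illustrative names, not part of the code above.

```python
import doctest
import math

def circle_area(radius):
    """Return the area of a circle.

    >>> round(circle_area(2), 2)
    12.57
    >>> circle_area(0)
    0.0
    >>> circle_area(-1)
    Traceback (most recent call last):
        ...
    ValueError: radius must be non-negative
    """
    if radius < 0:
        raise ValueError("radius must be non-negative")
    return math.pi * radius ** 2

def run_docstring_examples(func):
    """Run the doctest examples in func's docstring and return the summary."""
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    return runner.summarize(verbose=False)
```

For a whole module, the more common entry point is `python -m doctest your_module.py`, which runs every docstring example in the file.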

8.7.3 Testing

Tests verify that code behaves as intended and catch regressions quickly after a change.

# Using the unittest module
import unittest

def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

class TestMathFunctions(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(1, 2), 3)
        self.assertEqual(add(-1, 1), 0)
        self.assertEqual(add(0, 0), 0)
    
    def test_subtract(self):
        self.assertEqual(subtract(5, 3), 2)
        self.assertEqual(subtract(3, 5), -2)
        self.assertEqual(subtract(0, 0), 0)

if __name__ == '__main__':
    unittest.main()

# Using pytest (third-party: pip install pytest); shown as a string because it belongs in its own test file
"""
def multiply(a, b):
    return a * b

def test_multiply():
    assert multiply(2, 3) == 6
    assert multiply(-2, 3) == -6
    assert multiply(0, 5) == 0
"""
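
Two `unittest` features are worth knowing beyond `assertEqual`: `assertRaises` for checking error paths and `subTest` for table-driven cases. The following is a small sketch of both. `safe_divide`, `TestSafeDivide`, and `run_tests` are hypothetical names used only here.

```python
import io
import unittest

def safe_divide(a, b):
    """Divide a by b, rejecting a zero divisor with ValueError."""
    if b == 0:
        raise ValueError("b must not be zero")
    return a / b

class TestSafeDivide(unittest.TestCase):
    def test_table(self):
        cases = [((6, 3), 2), ((-6, 3), -2), ((1, 4), 0.25)]
        for args, expected in cases:
            # Each failing case is reported separately instead of stopping at the first
            with self.subTest(args=args):
                self.assertEqual(safe_divide(*args), expected)

    def test_zero_divisor(self):
        with self.assertRaises(ValueError):
            safe_divide(1, 0)

def run_tests():
    """Run the test case programmatically and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSafeDivide)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

In a real project the test class would live in its own file and be run with `python -m unittest`; `run_tests` exists here only to make the sketch self-contained.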

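8.7.4 Error Handling

Careful error handling makes a program more robust: catch the specific exceptions you can handle and let the rest propagate.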

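# Catch exceptions with try-except
def divide(a, b):
    try:
        result = a / b
    except ZeroDivisionError:
        print("Error: the divisor must not be zero")
        return None
    except TypeError:
        print("Error: both arguments must be numbers")
        return None
    else:
        return result
    finally:
        # Runs on every path, even after a return statement above
        print("Division finished")

print(divide(10, 2))    # divide returns 5.0
print(divide(10, 0))    # divide returns None
print(divide(10, 'a'))  # divide returns None

# Custom exception
class InvalidInputError(Exception):
    """Raised when the input is invalid."""
    pass

def validate_email(email):
    # Deliberately minimal check: only tests for the presence of '@'
    if '@' not in email:
        raise InvalidInputError(f"Invalid email address: {email}")
    return True

try:
    validate_email("user@example.com")
    validate_email("invalid-email")  # Raises InvalidInputError
except InvalidInputError as e:
    print(f"Validation failed: {e}")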
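
When a low-level exception is translated into a custom one, `raise ... from ...` keeps the original error attached as `__cause__`, so the traceback still shows the root cause. The sketch below illustrates this. `ConfigError` and `parse_port` are hypothetical names, not part of the examples above.

```python
class ConfigError(Exception):
    """Raised when a configuration value is invalid."""

def parse_port(raw):
    """Convert a string to a TCP port number, raising ConfigError on bad input."""
    try:
        port = int(raw)
    except ValueError as exc:
        # Chain the original ValueError so the root cause is not lost
        raise ConfigError(f"port is not an integer: {raw!r}") from exc
    if not 0 < port < 65536:
        raise ConfigError(f"port out of range: {port}")
    return port
```

Callers then only need to catch `ConfigError`, while the chained `ValueError` remains available for debugging.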

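8.7.5 Performance Optimization

A few common techniques for speeding up Python code. Measure before and after each change, because the gains vary with the interpreter version and the workload.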

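# Bind frequently used globals and attributes to local names
import math

def slow_function():
    result = 0
    for i in range(1000000):
        # Looks up the global `math` and its attribute `sqrt` on every iteration
        result += math.sqrt(i)
    return result

def fast_function():
    result = 0
    sqrt = math.sqrt  # Look up once and keep it in a local variable
    for i in range(1000000):
        result += sqrt(i)
    return result

# Prefer list comprehensions to explicit append loops
# Slower
def create_list_slow():
    result = []
    for i in range(10000):
        result.append(i * 2)
    return result

# Faster
def create_list_fast():
    return [i * 2 for i in range(10000)]

# Use generator expressions for large data
# Memory-friendly: no intermediate list is built
def process_large_data():
    return sum(i for i in range(1000000))

# Use built-in functions and the standard library
def count_words(text):
    # collections.Counter counts occurrences of each word
    from collections import Counter
    words = text.split()
    return Counter(words)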
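
Claims such as "faster" are best checked by measurement. The sketch below uses the standard `timeit` module to compare an append loop with a list comprehension. `build_with_loop`, `build_with_comprehension`, and `compare` are illustrative names, and the default sizes are arbitrary assumptions.

```python
import timeit

def build_with_loop(n):
    result = []
    for i in range(n):
        result.append(i * 2)
    return result

def build_with_comprehension(n):
    return [i * 2 for i in range(n)]

def compare(n=10_000, repeats=5, number=20):
    """Return the best-of-`repeats` time in seconds for each variant."""
    timings = {}
    for func in (build_with_loop, build_with_comprehension):
        # Each measurement calls func(n) `number` times; the minimum is the least noisy
        runs = timeit.repeat(lambda: func(n), repeat=repeats, number=number)
        timings[func.__name__] = min(runs)
    return timings
```

Calling `compare()` returns a dict mapping each function name to its best time. The absolute numbers depend on the machine and Python version, so none are quoted here.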

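8.7.6 Memory Management

CPython manages memory automatically, mainly through reference counting plus a garbage collector that handles reference cycles. A few habits can still reduce memory use.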

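# Release large objects as soon as they are no longer needed
import gc

def process_data(data):
    """Hypothetical placeholder for real processing: counts lines."""
    return len(data.splitlines())

def process_large_file():
    # Read the whole file into memory (the file name is a placeholder)
    with open('large_file.txt', 'r') as f:
        data = f.read()

    # Process the data
    result = process_data(data)

    # Drop the reference to the large object
    del data
    gc.collect()  # Optional: only needed to reclaim reference cycles right away

    return result

# Use a generator to process large inputs

def read_large_file(file_path):
    """Yield the file one stripped line at a time to save memory."""
    with open(file_path, 'r') as f:
        for line in f:
            yield line.strip()

# Avoid unnecessary copies
def process_list(items):
    # Don't do this: it creates an unnecessary copy
    # new_items = items[:]  # copies the whole list

    # Modify the original list in place (the caller sees the change)
    for i in range(len(items)):
        items[i] = items[i] * 2
    return items

# Use weak references so a reference does not keep an object alive
import weakref

class LargeObject:
    def __init__(self, name):
        self.name = name
        self.data = bytearray(1024 * 1024)  # 1 MB of data

large_obj = LargeObject("test")
weak_ref = weakref.ref(large_obj)

print("large_obj alive:", weak_ref() is not None)  # True
del large_obj
print("large_obj alive after del:", weak_ref() is not None)  # False on CPython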
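
The memory benefit of generators can be observed with `sys.getsizeof`. The sketch below compares a list with an equivalent generator expression; the variable names and the size `n` are arbitrary choices for illustration. Note that `getsizeof` reports only the shallow size of the container, not the integers it references.

```python
import sys

n = 100_000
squares_list = [i * i for i in range(n)]   # all values exist in memory at once
squares_gen = (i * i for i in range(n))    # values are produced one at a time

list_size = sys.getsizeof(squares_list)    # grows with n
gen_size = sys.getsizeof(squares_gen)      # small and independent of n

total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)          # consumes the generator

print(f"list: {list_size} bytes, generator: {gen_size} bytes")
print("same total:", total_from_list == total_from_gen)
```

The trade-off is that a generator can be iterated only once, so keep the list when the data must be traversed repeatedly or indexed.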

8.7.7 Code Organization

A clear project layout and explicit imports make code easier to maintain.

# Example project layout
"""
myproject/
├── __init__.py
├── config.py           # Configuration
├── utils/              # Utility functions
│   ├── __init__.py
│   ├── helpers.py
│   └── validators.py
├── models/             # Data models
│   ├── __init__.py
│   ├── user.py
│   └── product.py
├── services/           # Business logic
│   ├── __init__.py
│   ├── auth_service.py
│   └── product_service.py
└── tests/              # Tests
    ├── __init__.py
    ├── test_utils.py
    └── test_models.py
"""

# Import best practices

# Not recommended
# from module import *

# Recommended (`module` and `specific_function` are placeholder names)
from module import specific_function
import module

# Use __all__ to control what `from module import *` exports
def func1():
    pass

def func2():
    pass

__all__ = ['func1']  # A star import brings in only func1; func2 can still be imported explicitly
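
The effect of `__all__` can be checked without creating files by building a throwaway module in memory. This is only a demonstration device, and `demo_mod` is a hypothetical module name. The sketch shows that a star import honors `__all__`, while an explicit import can still reach names left out of it.

```python
import sys
import types

# Build an in-memory module that defines two functions but exports only one
demo = types.ModuleType("demo_mod")
exec(
    "__all__ = ['func1']\n"
    "def func1(): return 'one'\n"
    "def func2(): return 'two'\n",
    demo.__dict__,
)
sys.modules["demo_mod"] = demo  # make it importable by name

# A star import copies only the names listed in __all__
star_ns = {}
exec("from demo_mod import *", star_ns)

# An explicit import is not restricted by __all__
explicit_ns = {}
exec("from demo_mod import func2", explicit_ns)

print("func1" in star_ns, "func2" in star_ns)  # True False
print(explicit_ns["func2"]())                  # two
```

So `__all__` documents the public interface and limits star imports, but it is not an access-control mechanism.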

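8.8 Summary

Python is a powerful and flexible language whose advanced features let developers write more elegant and efficient code. Generators, decorators, context managers, metaprogramming, functional programming, and concurrency help you tackle more complex problems and improve the quality and performance of your code.

Following Python best practices for code style, documentation, testing, error handling, and performance optimization makes your code more readable, maintainable, and robust. Whether you are a beginner or an experienced developer, continued study and practice of these features and practices will make you a better Python programmer.

This tutorial has covered the fundamentals and advanced features of Python, enough to start writing your own applications. Practice is the best way to learn programming: keep writing code and solving problems, and your skills will keep improving.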