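Database Operations

Step 5: Python Database Operations

Suggested time: 1-2 weeks | Goal: work with MySQL/PostgreSQL from Python, covering complex queries, inserts/updates/deletes, and connection pools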


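5.1 Database Connection Basics

Concept: Connecting to a database from Python

Python talks to a database through a driver. Common MySQL drivers are pymysql (synchronous) and aiomysql (asynchronous); common PostgreSQL drivers are psycopg2 (synchronous) and asyncpg (asynchronous).

Connecting to MySQL with pymysql

Concept: pymysql is a pure-Python MySQL driver that needs no compilation and runs on every platform. In PyMySQL 1.0 and later, with connection closes the connection when the block exits; it does not commit for you, and autocommit is off by default, so writes need an explicit connection.commit(). with connection.cursor() closes the cursor automatically.

bash
pip install pymysql
python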
import pymysql

# Create the connection
connection = pymysql.connect(
    host="localhost",
    port=3306,
    user="root",
    password="your_password",
    database="myapp",
    charset="utf8mb4"
)

# The with block closes the connection when it exits
with connection:
    with connection.cursor() as cursor:
        # Run a query
        cursor.execute("SELECT * FROM users WHERE id = %s", (1,))
        result = cursor.fetchone()
        print(result)

    # Writes must be committed explicitly
    with connection.cursor() as cursor:
        cursor.execute("INSERT INTO users (username, email) VALUES (%s, %s)", ("alice", "alice@example.com"))
        connection.commit()

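Connecting to PostgreSQL with psycopg2

Concept: psycopg2 is the mainstream Python driver for PostgreSQL. It is a fast C extension built on libpq and supports parameterized queries. Usage resembles pymysql, with one difference: with connection commits the transaction on success and rolls it back on an exception, but leaves the connection open, so call connection.close() when you are done.

bash
pip install psycopg2-binary
python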
import psycopg2

connection = psycopg2.connect(
    host="localhost",
    port=5432,
    user="postgres",
    password="your_password",
    database="myapp"
)

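# with connection commits on success and rolls back on an exception
with connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT * FROM users WHERE id = %s", (1,))
        result = cursor.fetchone()
        print(result)

# psycopg2 leaves the connection open after the with block, so close it explicitly
connection.close()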

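Connection Parameters

Concept: Connection parameters tell the driver where the database server is, how to authenticate, and which database to use by default. Getting them right is a prerequisite for connecting at all.

| Parameter | Description | Example |
| --- | --- | --- |
| host | Database server address | localhost / 127.0.0.1 |
| port | Port number | 3306 (MySQL) / 5432 (PostgreSQL) |
| user | User name | root / postgres |
| password | Password | your_password |
| database | Database name | myapp |
| charset | Character set | utf8mb4 (MySQL) |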
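
The parameters above are often read from the environment rather than hard-coded. The following is a minimal sketch of that idea; the variable names DB_HOST, DB_PORT, DB_USER, DB_PASSWORD, DB_NAME, and DB_CHARSET and the helper load_db_config are assumptions chosen for illustration, not names any driver requires.

```python
import os


def load_db_config(env=None):
    """Build keyword arguments for a driver's connect() call.

    The DB_* variable names are hypothetical; adapt them to your deployment.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("DB_HOST", "localhost"),
        "port": int(env.get("DB_PORT", "3306")),  # env values are strings
        "user": env.get("DB_USER", "root"),
        "password": env.get("DB_PASSWORD", ""),
        "database": env.get("DB_NAME", "myapp"),
        "charset": env.get("DB_CHARSET", "utf8mb4"),
    }


# Pass a plain dict instead of os.environ to see the effect of overrides
config = load_db_config({"DB_HOST": "10.0.0.5", "DB_PORT": "3307"})
print(config["host"], config["port"])
```

The resulting dict could then be unpacked into a call such as pymysql.connect(**config). For psycopg2 you would drop the charset key, since it is a MySQL setting.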

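5.2 Insert, Update, and Delete

Concept: Writes (INSERT/UPDATE/DELETE)

Write operations modify the database, so you must call connection.commit() to commit the transaction unless autocommit is enabled.

INSERT: Adding Rows

Concept: INSERT adds new records to a table. cursor.lastrowid returns the auto-increment primary key of the inserted row, and executemany inserts several rows in one call.

python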
import pymysql

connection = pymysql.connect(host="localhost", user="root", password="pwd", database="myapp")

with connection:
    with connection.cursor() as cursor:
        # Insert a single row
        sql = "INSERT INTO users (username, email, age) VALUES (%s, %s, %s)"
        cursor.execute(sql, ("alice", "alice@example.com", 25))
        connection.commit()
        print(f"Inserted ID: {cursor.lastrowid}")

    with connection.cursor() as cursor:
        # Insert several rows in one call
        sql = "INSERT INTO users (username, email) VALUES (%s, %s)"
        data = [
            ("bob", "bob@example.com"),
            ("charlie", "charlie@example.com"),
            ("dave", "dave@example.com")
        ]
        cursor.executemany(sql, data)
        connection.commit()
        print(f"Inserted {cursor.rowcount} rows")

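UPDATE: Modifying Rows

Concept: UPDATE modifies records that already exist. cursor.rowcount returns the number of affected rows. Note that MySQL by default counts only rows whose values actually changed, so 0 can mean either that nothing matched or that the matched rows already held the new values.

python
with connection.cursor() as cursor:
    # Update a single row
    sql = "UPDATE users SET age = %s WHERE username = %s"
    cursor.execute(sql, (26, "alice"))
    connection.commit()
    print(f"Updated {cursor.rowcount} rows")

with connection.cursor() as cursor:
    # Update many rows with one statement
    sql = "UPDATE users SET is_active = %s WHERE created_at < %s"
    cursor.execute(sql, (False, "2024-01-01"))
    connection.commit()
    print(f"Updated {cursor.rowcount} rows")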

DELETE: Removing Rows

Concept: DELETE removes the records that match a condition. Use it with care: a DELETE without a WHERE clause removes every row in the table.

python
with connection.cursor() as cursor:
    # Delete a single row
    sql = "DELETE FROM users WHERE id = %s"
    cursor.execute(sql, (1,))
    connection.commit()
    print(f"Deleted {cursor.rowcount} rows")

with connection.cursor() as cursor:
    # Delete many rows with one statement
    sql = "DELETE FROM users WHERE is_active = %s"
    cursor.execute(sql, (False,))
    connection.commit()
    print(f"Deleted {cursor.rowcount} rows")

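Rolling Back a Transaction

Concept: A rollback undoes the statements executed since the transaction began. When an error occurs, call connection.rollback() to return to the state before the transaction started.

python
with connection.cursor() as cursor:
    try:
        cursor.execute("INSERT INTO users (username) VALUES ('test')")
        # An UPDATE that matches no rows is not an error; it simply affects 0 rows
        cursor.execute("UPDATE users SET age = 100 WHERE username = 'nonexistent'")
        connection.commit()
    except Exception as e:
        connection.rollback()  # Undo every statement in the transaction
        print(f"Transaction rolled back: {e}")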
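
To see a rollback actually undo work, the sketch below uses the standard-library sqlite3 module as a stand-in so it runs without a database server. This is an assumption made for portability: the commit/rollback pattern is the same with pymysql, but sqlite3 uses ? placeholders instead of %s and raises sqlite3.IntegrityError. The helper insert_users is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT UNIQUE NOT NULL)")
conn.commit()


def insert_users(conn, usernames):
    """Insert all usernames in one transaction; undo everything if any insert fails."""
    cursor = conn.cursor()
    try:
        for name in usernames:
            cursor.execute("INSERT INTO users (username) VALUES (?)", (name,))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # discards the rows inserted earlier in this call
        return False
    finally:
        cursor.close()


ok_first = insert_users(conn, ["alice", "bob"])
ok_second = insert_users(conn, ["carol", "alice"])  # "alice" violates UNIQUE
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(ok_first, ok_second, row_count)
```

The second call inserts "carol" successfully before failing on the duplicate "alice"; the rollback removes "carol" as well, so only the two rows from the first call remain.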

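5.3 Queries

Concept: Queries (SELECT)

Queries do not modify data, so no commit is needed. fetchone() returns one row, fetchall() returns all remaining rows, and fetchmany(n) returns up to n rows.

Basic Queries

Concept: Basic queries read their results through the fetch methods. fetchone() returns a single row (a tuple by default, a dict with DictCursor) or None when no rows are left; fetchall() returns every remaining row; fetchmany(n) returns at most n rows.

python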
with connection.cursor(pymysql.cursors.DictCursor) as cursor:
    # Fetch all rows
    cursor.execute("SELECT * FROM users")
    users = cursor.fetchall()
    for user in users:
        print(user)

with connection.cursor() as cursor:
    # Fetch a single row
    cursor.execute("SELECT * FROM users WHERE id = %s", (1,))
    user = cursor.fetchone()
    print(user)

with connection.cursor() as cursor:
    # Fetch several rows (fetchmany takes 5 of the 10 rows the query returns)
    cursor.execute("SELECT * FROM users LIMIT %s", (10,))
    users = cursor.fetchmany(5)
    for user in users:
        print(user)

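Conditional Queries

Concept: Conditional queries filter rows with a WHERE clause. It supports AND/OR combinations, LIKE pattern matching, IN list matching, and BETWEEN range checks.

python
with connection.cursor() as cursor:
    # AND condition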
    cursor.execute(
        "SELECT * FROM users WHERE age >= %s AND is_active = %s",
        (18, True)
    )
    users = cursor.fetchall()

    # OR condition
    cursor.execute(
        "SELECT * FROM users WHERE username = %s OR email = %s",
        ("alice", "alice@example.com")
    )
    users = cursor.fetchall()

    # LIKE pattern matching
    cursor.execute(
        "SELECT * FROM users WHERE username LIKE %s",
        ("a%",)  # usernames starting with "a"
    )
    users = cursor.fetchall()

    # IN query (one placeholder generated per list element)
    user_ids = [1, 2, 3]
    placeholders = ", ".join(["%s"] * len(user_ids))
    cursor.execute(
        f"SELECT * FROM users WHERE id IN ({placeholders})",
        tuple(user_ids)
    )
    users = cursor.fetchall()

    # BETWEEN range query
    cursor.execute(
        "SELECT * FROM users WHERE age BETWEEN %s AND %s",
        (18, 30)
    )
    users = cursor.fetchall()
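
The dynamic IN placeholder pattern above breaks on an empty list, because IN () is a syntax error. A small helper can cover that case. This is a sketch only: build_in_query is a hypothetical name, and it assumes the base SQL and column name come from trusted code, never from user input, since they are interpolated into the statement.

```python
def build_in_query(base_sql, column, values):
    """Return (sql, params) for a parameterized IN filter.

    base_sql and column must be trusted strings; only the values are
    passed to the driver as parameters.
    """
    values = list(values)
    if not values:
        # "IN ()" is invalid SQL, so produce a condition that matches nothing
        return f"{base_sql} WHERE 1 = 0", ()
    placeholders = ", ".join(["%s"] * len(values))
    return f"{base_sql} WHERE {column} IN ({placeholders})", tuple(values)


sql, params = build_in_query("SELECT * FROM users", "id", [1, 2, 3])
print(sql)
print(params)

empty_sql, empty_params = build_in_query("SELECT * FROM users", "id", [])
print(empty_sql)
```

With a pymysql cursor the result would be used as cursor.execute(sql, params), so the values are still escaped by the driver.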

Sorting and Pagination

Concept: ORDER BY sorts the result and LIMIT/OFFSET paginates it. For a 1-based page number the offset is offset = (page - 1) * page_size.

python
with connection.cursor() as cursor:
    # Sorting
    cursor.execute("SELECT * FROM users ORDER BY age DESC")
    users = cursor.fetchall()

    cursor.execute("SELECT * FROM users ORDER BY created_at ASC, age DESC")
    users = cursor.fetchall()

    # Paginated query
    page = 2  # Page 2
    page_size = 10  # 10 rows per page
    offset = (page - 1) * page_size

    cursor.execute(
        "SELECT * FROM users ORDER BY id LIMIT %s OFFSET %s",
        (page_size, offset)
    )
    users = cursor.fetchall()
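
A paginated endpoint usually needs the total row count and page count alongside the rows. The sketch below wraps the offset formula in a hypothetical paginate helper, using sqlite3 as a stand-in so it runs without a server (with pymysql the placeholders would be %s instead of ?).

```python
import math
import sqlite3


def paginate(conn, page, page_size):
    """Return one page of users plus the totals a client needs for paging."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    total = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    offset = (page - 1) * page_size
    rows = conn.execute(
        "SELECT id, username FROM users ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    return {"items": rows, "total": total, "pages": math.ceil(total / page_size)}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)",
    [(f"user{i}",) for i in range(1, 26)],  # 25 sample rows
)

result = paginate(conn, page=3, page_size=10)
print(len(result["items"]), result["total"], result["pages"])
```

A stable ORDER BY matters here: without it the database may return rows in a different order on each call, so pages can overlap or skip rows.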

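5.4 Complex Queries

Concept: Complex queries

Complex queries cover aggregation, grouping, JOINs across tables, subqueries, and similar techniques.

Aggregation

Concept: An aggregate function computes a single value from a set of rows: COUNT counts, SUM adds, AVG averages, MAX finds the largest value, and MIN the smallest.

python
with connection.cursor() as cursor:
    # COUNT: number of rows
    cursor.execute("SELECT COUNT(*) FROM users")
    count = cursor.fetchone()[0]
    print(f"Total users: {count}")

    # COUNT with a condition
    cursor.execute("SELECT COUNT(*) FROM users WHERE is_active = %s", (True,))
    active_count = cursor.fetchone()[0]

    # SUM (returns NULL when no rows match, hence the "or 0")
    cursor.execute("SELECT SUM(price) FROM orders WHERE user_id = %s", (1,))
    total = cursor.fetchone()[0] or 0

    # AVG: average value
    cursor.execute("SELECT AVG(age) FROM users")
    avg_age = cursor.fetchone()[0]

    # MAX / MIN: largest and smallest value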
    cursor.execute("SELECT MAX(price), MIN(price) FROM products")
    max_price, min_price = cursor.fetchone()

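Grouping with GROUP BY

Concept: GROUP BY groups rows by one or more columns and, combined with aggregate functions, produces per-group statistics. HAVING filters the groups after aggregation, whereas WHERE filters rows before it.

python
with connection.cursor() as cursor:
    # Statistics per category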
    sql = """
        SELECT category, COUNT(*) as count, AVG(price) as avg_price
        FROM products
        GROUP BY category
        HAVING COUNT(*) > 5
        ORDER BY count DESC
    """
    cursor.execute(sql)
    results = cursor.fetchall()
    for row in results:
        print(f"Category: {row[0]}, count: {row[1]}, average price: {row[2]}")

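JOIN Queries

Concept: JOIN combines related rows from several tables. INNER JOIN keeps only rows that match on both sides, LEFT JOIN keeps every row from the left table, and further JOINs can be chained to bring in more tables.

python
# Schema: users (id, username), orders (id, user_id, total, created_at),
#         order_items (id, order_id, product_id, quantity), products (id, name)

with connection.cursor() as cursor:
    # INNER JOIN: only rows present in both tables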
    sql = """
        SELECT u.username, o.id, o.total, o.created_at
        FROM users u
        INNER JOIN orders o ON u.id = o.user_id
        WHERE o.total > 100
        ORDER BY o.created_at DESC
    """
    cursor.execute(sql)
    results = cursor.fetchall()

    # LEFT JOIN: keeps every user, including those with no orders
    sql = """
        SELECT u.username, COUNT(o.id) as order_count, COALESCE(SUM(o.total), 0) as total_spent
        FROM users u
        LEFT JOIN orders o ON u.id = o.user_id
        GROUP BY u.id, u.username
    """
    cursor.execute(sql)
    results = cursor.fetchall()

    # Joining several tables
    sql = """
        SELECT u.username, o.id as order_id, p.name as product_name, oi.quantity
        FROM orders o
        INNER JOIN users u ON o.user_id = u.id
        INNER JOIN order_items oi ON o.id = oi.order_id
        INNER JOIN products p ON oi.product_id = p.id
        WHERE o.id = %s
    """
    cursor.execute(sql, (1,))
    items = cursor.fetchall()
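
The difference between INNER JOIN and LEFT JOIN is easiest to see on a tiny data set. The sketch below uses sqlite3 as a stand-in with two made-up users, one of whom has no orders; the table contents are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    INSERT INTO users (id, username) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO orders (id, user_id, total) VALUES (1, 1, 120.0), (2, 1, 80.0);
""")

# INNER JOIN drops bob, who has no matching order rows
inner_rows = conn.execute("""
    SELECT u.username, COUNT(o.id)
    FROM users u
    INNER JOIN orders o ON u.id = o.user_id
    GROUP BY u.id, u.username
    ORDER BY u.id
""").fetchall()

# LEFT JOIN keeps bob; COUNT(o.id) ignores NULLs and COALESCE replaces the NULL sum
left_rows = conn.execute("""
    SELECT u.username, COUNT(o.id), COALESCE(SUM(o.total), 0)
    FROM users u
    LEFT JOIN orders o ON u.id = o.user_id
    GROUP BY u.id, u.username
    ORDER BY u.id
""").fetchall()

print(inner_rows)
print(left_rows)
```

Note that COUNT(o.id) is used rather than COUNT(*): after a LEFT JOIN a user without orders still produces one row, so COUNT(*) would report 1 instead of 0.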

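Subqueries

Concept: A subquery is a query nested inside another query. A subquery in WHERE supplies a value for a condition, IN/EXISTS check for existence, and a subquery in FROM acts as a temporary table.

python
with connection.cursor() as cursor:
    # WHERE subquery: products priced above the average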
    sql = """
        SELECT * FROM products
        WHERE price > (SELECT AVG(price) FROM products)
    """
    cursor.execute(sql)
    expensive_products = cursor.fetchall()

    # IN subquery: users who have placed orders
    sql = """
        SELECT * FROM users
        WHERE id IN (SELECT DISTINCT user_id FROM orders)
    """
    cursor.execute(sql)
    users_with_orders = cursor.fetchall()

    # EXISTS subquery: categories that contain products
    sql = """
        SELECT * FROM categories c
        WHERE EXISTS (
            SELECT 1 FROM products p WHERE p.category_id = c.id
        )
    """
    cursor.execute(sql)
    categories_with_products = cursor.fetchall()

    # FROM subquery: aggregate per group first, then filter
    sql = """
        SELECT * FROM (
            SELECT category_id, COUNT(*) as cnt, AVG(price) as avg_price
            FROM products
            GROUP BY category_id
        ) AS stats
        WHERE cnt > 10
    """
    cursor.execute(sql)
    result = cursor.fetchall()

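Combining Queries with UNION

Concept: UNION merges the result sets of two queries and removes duplicates; UNION ALL keeps duplicates and is faster because it skips that step. The combined queries must return the same number of columns with compatible data types.

python
with connection.cursor() as cursor:
    # UNION: merge results and remove duplicates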
    sql = """
        SELECT username FROM users WHERE is_active = 1
        UNION
        SELECT username FROM admin_users
    """
    cursor.execute(sql)
    active_usernames = cursor.fetchall()

    # UNION ALL: keep duplicates
    sql = """
        SELECT 'user' as type, username FROM users
        UNION ALL
        SELECT 'admin' as type, username FROM admin_users
    """
    cursor.execute(sql)
    all_users = cursor.fetchall()

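5.5 Database Connection Pools

Concept: Connection pool

A connection pool creates a number of database connections in advance. Code borrows a connection from the pool when needed and returns it afterwards, which avoids repeatedly opening and closing connections and improves performance and resource usage.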
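
The borrow-and-return cycle can be illustrated with a toy pool. The SimplePool class below is a hypothetical teaching sketch built on queue.Queue and sqlite3, not how any production pool is implemented; real pools also handle stale connections, overflow, and thread safety of the connections themselves.

```python
import queue
import sqlite3
from contextlib import contextmanager


class SimplePool:
    """Toy fixed-size pool: connections are created up front and reused."""

    def __init__(self, factory, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # waits while the pool is empty
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return the connection instead of closing it

    def close(self):
        while not self._idle.empty():
            self._idle.get_nowait().close()


pool = SimplePool(lambda: sqlite3.connect(":memory:"), size=2)

# Five borrow/return cycles only ever touch the two pooled connections
seen = set()
for _ in range(5):
    with pool.connection() as conn:
        conn.execute("SELECT 1").fetchone()
        seen.add(id(conn))

# With both connections checked out, a third request times out
with pool.connection() as a, pool.connection() as b:
    try:
        with pool.connection(timeout=0.1):
            exhausted = False
    except queue.Empty:
        exhausted = True

print(len(seen), exhausted)
pool.close()
```

The timeout case mirrors what blocking behavior means in a real pool: when every connection is in use, a caller either waits for one to be returned or gets an error.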

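DBUtils PooledDB (Synchronous Connection Pool)

Concept: PooledDB from the DBUtils library is a synchronous connection pool that wraps a DB-API driver such as pymysql. It suits synchronous web frameworks such as Flask and Django.

bash
pip install pymysql dbutils
python
from dbutils.pooled_db import PooledDB
import pymysql

# Create the connection pool
pool = PooledDB(
    creator=pymysql,  # DB-API module used to create connections
    maxconnections=20,  # Maximum number of connections
    mincached=5,  # Idle connections created at startup
    maxcached=10,  # Maximum idle connections kept in the pool
    blocking=True,  # Block and wait when the pool is exhausted
    maxusage=None,  # Maximum reuses of a single connection (None = unlimited)
    setsession=[],  # SQL statements run to prepare each new session
    ping=1,  # When to check liveness (1 = whenever a connection is fetched from the pool)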
    host="localhost",
    port=3306,
    user="root",
    password="pwd",
    database="myapp",
    charset="utf8mb4"
)

# Using a pooled connection
def query_users():
    conn = pool.connection()  # Borrow a connection from the pool
    try:
        with conn.cursor() as cursor:
            cursor.execute("SELECT * FROM users")
            return cursor.fetchall()
    finally:
        conn.close()  # Returns the connection to the pool rather than closing it

def insert_user(username, email):
    conn = pool.connection()
    try:
        with conn.cursor() as cursor:
            cursor.execute("INSERT INTO users (username, email) VALUES (%s, %s)", (username, email))
            conn.commit()
            return cursor.lastrowid
    finally:
        conn.close()

# Batch operation
def batch_insert_users(users):
    conn = pool.connection()
    try:
        with conn.cursor() as cursor:
            sql = "INSERT INTO users (username, email) VALUES (%s, %s)"
            cursor.executemany(sql, users)
            conn.commit()
    finally:
        conn.close()

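aiomysql (Asynchronous Connection Pool)

Concept: aiomysql is an asynchronous MySQL driver with async/await support. It suits asynchronous web frameworks such as FastAPI and Quart, and combined with asyncio.gather it can run many queries concurrently.

bash
pip install aiomysql
python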
import asyncio
import aiomysql

async def create_pool():
    pool = await aiomysql.create_pool(
        host="localhost",
        port=3306,
        user="root",
        password="pwd",
        db="myapp",
        minsize=5,
        maxsize=20,
        charset="utf8mb4"
    )
    return pool

async def query_users(pool):
    async with pool.acquire() as conn:
        async with conn.cursor(aiomysql.DictCursor) as cursor:
            await cursor.execute("SELECT * FROM users")
            return await cursor.fetchall()

async def insert_user(pool, username, email):
    async with pool.acquire() as conn:
        async with conn.cursor() as cursor:
            await cursor.execute(
                "INSERT INTO users (username, email) VALUES (%s, %s)",
                (username, email)
            )
            await conn.commit()
            return cursor.lastrowid

async def main():
    pool = await create_pool()
    try:
        # Query
        users = await query_users(pool)
        print(users)

        # Insert
        user_id = await insert_user(pool, "alice", "alice@example.com")
        print(f"Inserted ID: {user_id}")

        # Run several queries concurrently
        tasks = [query_users(pool) for _ in range(10)]
        results = await asyncio.gather(*tasks)
    finally:
        pool.close()
        await pool.wait_closed()

asyncio.run(main())

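SQLAlchemy Connection Pool

Concept: The SQLAlchemy engine has a built-in connection pool. pool_size sets the number of connections kept in the pool, max_overflow sets how many extra connections may be opened beyond that, and pool_pre_ping tests each connection before use so stale ones are replaced.

bash
pip install sqlalchemy pymysql
python
from sqlalchemy import create_engine, text

# Create the engine (it comes with a connection pool by default)
engine = create_engine(
    "mysql+pymysql://user:pwd@localhost/myapp",
    pool_size=10,           # Connections kept in the pool
    max_overflow=20,        # Extra connections allowed beyond pool_size
    pool_recycle=3600,      # Recycle connections older than this many seconds
    pool_pre_ping=True,     # Test each connection before use
    echo=False              # Whether to log SQL statements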
)

# Using a connection
with engine.connect() as conn:
    result = conn.execute(text("SELECT * FROM users"))
    users = result.fetchall()

# Transaction (committed when the begin() block exits without an error)
with engine.connect() as conn:
    with conn.begin():
        conn.execute(text("INSERT INTO users (username) VALUES (:username)"), {"username": "alice"})

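Connection Pool Parameters

Concept: Tune the pool parameters to the application's workload and the database server's limits to get good performance and resource usage. pool_size, max_overflow, pool_recycle, and pool_pre_ping are SQLAlchemy create_engine options; minsize and maxsize belong to aiomysql.create_pool.

| Parameter | Description | Suggested value |
| --- | --- | --- |
| pool_size | Connections kept in the pool | 5-20 |
| max_overflow | Maximum extra connections beyond pool_size | 10-30 |
| pool_recycle | Connection recycle time (seconds) | 3600 |
| pool_pre_ping | Test each connection before use | True |
| minsize | Minimum number of connections | 2-5 |
| maxsize | Maximum number of connections | 10-50 |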
Connection Pool Usage Scenarios

python
# Scenario 1: web application (return the connection when the request ends)
pool = PooledDB(creator=pymysql, maxconnections=20, ...)

def handle_request(user_id):
    conn = pool.connection()
    try:
        with conn.cursor() as cursor:
            cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
            return cursor.fetchone()
    finally:
        conn.close()

# Scenario 2: scheduled job (batch processing)
def daily_report():
    conn = pool.connection()
    try:
        with conn.cursor() as cursor:
            cursor.execute("SELECT COUNT(*) FROM orders WHERE DATE(created_at) = CURDATE()")
            return cursor.fetchone()
    finally:
        conn.close()

# Scenario 3: async web application (using an aiomysql pool)
async def async_query(pool):
    async with pool.acquire() as conn:
        async with conn.cursor() as cursor:
            await cursor.execute("SELECT * FROM users")
            return await cursor.fetchall()

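5.6 ORM Operations

Concept: ORM

ORM (Object-Relational Mapping) maps database tables to Python classes so you can work with the database in an object-oriented way. Main advantages: far less hand-written SQL, better type safety, more maintainable code, and a lower cost of switching databases.


SQLAlchemy ORM

Concept: SQLAlchemy is one of the most powerful and mature ORM libraries for Python. It has two layers: Core (the SQL expression layer) and ORM (the object-relational mapping layer). It supports the full range of SQL expressions, connection pooling, transaction management, and async operation, and it is a common choice with frameworks such as FastAPI and Flask.

bash
pip install sqlalchemy pymysql

Creating the Engine and Session

The Engine is SQLAlchemy's entry point and manages the connection pool and the database dialect. The Session is the unit of work for ORM operations; all creates, reads, updates, and deletes go through it.

python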
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, DeclarativeBase

engine = create_engine(
    "mysql+pymysql://root:password@localhost:3306/myapp?charset=utf8mb4",
    pool_size=10,
    max_overflow=20,
    pool_pre_ping=True,
    pool_recycle=3600,
    echo=False
)

SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)

class Base(DeclarativeBase):
    pass
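
Defining Models (Mapping Table Structure)

Define ORM models as subclasses of Base: __tablename__ sets the table name and Column defines each field. Primary keys, indexes, unique constraints, foreign keys, defaults, and comments are all supported.

Note: default=datetime.now passes a function reference, which is called for each inserted row; default=datetime.now() passes a fixed value computed once when the model is defined. You normally want the function reference.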
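
The difference is easy to reproduce in plain Python. The sketch below mimics, in a simplified way, how a column default is resolved: a callable is invoked for each row, while a plain value is reused. The helper resolve_default is hypothetical and stands in for the ORM's internal logic.

```python
import time
from datetime import datetime


def resolve_default(default):
    """Simplified stand-in for an ORM: call the default if it is callable,
    otherwise reuse the stored value as-is."""
    return default() if callable(default) else default


fixed_default = datetime.now()    # evaluated once, right here
callable_default = datetime.now   # evaluated again for every row

time.sleep(0.05)  # simulate time passing before a row is inserted

row_a_created = resolve_default(fixed_default)
row_b_created = resolve_default(callable_default)

print(row_a_created == fixed_default)  # the stale definition-time value
print(row_b_created > fixed_default)   # a fresh timestamp
```

In a long-running service the first form would stamp every row with the time the process started, which is rarely what is intended.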

python
from sqlalchemy import Column, Integer, String, Boolean, DateTime, Float, Text, ForeignKey, Enum as SQLEnum
from sqlalchemy.orm import relationship
from datetime import datetime
import enum

class UserStatus(str, enum.Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"
    BANNED = "banned"

class User(Base):
    __tablename__ = "users"

    id = Column(Integer, primary_key=True, autoincrement=True, comment="User ID")
    username = Column(String(50), unique=True, nullable=False, index=True, comment="User name")
    email = Column(String(100), unique=True, nullable=False, index=True, comment="Email")
    hashed_password = Column(String(255), nullable=False, comment="Password hash")
    age = Column(Integer, default=0, comment="Age")
    status = Column(SQLEnum(UserStatus), default=UserStatus.ACTIVE, comment="Status")
    is_superuser = Column(Boolean, default=False, comment="Is superuser")
    created_at = Column(DateTime, default=datetime.now, comment="Created at")
    updated_at = Column(DateTime, default=datetime.now, onupdate=datetime.now, comment="Updated at")

    orders = relationship("Order", back_populates="user", cascade="all, delete-orphan")

    def __repr__(self):
        return f"<User(id={self.id}, username='{self.username}')>"

class Order(Base):
    __tablename__ = "orders"

    id = Column(Integer, primary_key=True, autoincrement=True, comment="Order ID")
    order_no = Column(String(32), unique=True, nullable=False, index=True, comment="Order number")
    user_id = Column(Integer, ForeignKey("users.id", ondelete="CASCADE"), nullable=False, comment="User ID")
    total_amount = Column(Float, nullable=False, default=0.0, comment="Order total")
    shipping_address = Column(Text, nullable=True, comment="Shipping address")
    created_at = Column(DateTime, default=datetime.now, comment="Created at")

    user = relationship("User", back_populates="orders")

    def __repr__(self):
        return f"<Order(id={self.id}, order_no='{self.order_no}')>"
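
Table Structure Operations (Create/Drop/Alter)

SQLAlchemy manages table structure through Base.metadata, which can create and drop tables; the inspector checks whether a table exists and reads its structure. Note: the ORM does not alter existing tables (adding columns, changing types); use the Alembic migration tool for schema changes.

python
from sqlalchemy import inspect

# Create every table defined on Base (tables that already exist are skipped)
Base.metadata.create_all(engine)

# Drop every table defined on Base. Destructive, so it is left commented out here;
# running it would also make the inspection calls below fail.
# Base.metadata.drop_all(engine)

inspector = inspect(engine)
table_names = inspector.get_table_names()
print("Existing tables:", table_names)

users_exists = inspector.has_table("users")
print("users table exists:", users_exists)

columns = inspector.get_columns("users")
for col in columns:
    print(f"  Column: {col['name']}, type: {col['type']}, nullable: {col['nullable']}")

pk = inspector.get_pk_constraint("users")
print("Primary key:", pk)

fks = inspector.get_foreign_keys("orders")
for fk in fks:
    print(f"  Foreign key: {fk['constrained_columns']} -> {fk['referred_table']}.{fk['referred_columns']}")

indexes = inspector.get_indexes("users")
for idx in indexes:
    print(f"  Index: {idx['name']}, columns: {idx['column_names']}, unique: {idx['unique']}")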
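
Model Relationships in Detail

SQLAlchemy supports one-to-one, one-to-many, and many-to-many relationships. relationship defines the navigation attribute at the ORM level, ForeignKey defines the foreign-key constraint at the database level, and secondary names the association table for a many-to-many relationship. The User class below is an extended version of the earlier one; keep only one definition per table on the same Base.

python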
from sqlalchemy import Table, Column, Integer, ForeignKey, String, DateTime
from sqlalchemy.orm import relationship
from datetime import datetime

user_roles = Table(
    "user_roles",
    Base.metadata,
    Column("user_id", Integer, ForeignKey("users.id", ondelete="CASCADE"), primary_key=True),
    Column("role_id", Integer, ForeignKey("roles.id", ondelete="CASCADE"), primary_key=True)
)

class Role(Base):
    __tablename__ = "roles"

    id = Column(Integer, primary_key=True, autoincrement=True)
    name = Column(String(50), unique=True, nullable=False, comment="Role name")
    code = Column(String(50), unique=True, nullable=False, comment="Role code")
    created_at = Column(DateTime, default=datetime.now)

    users = relationship("User", secondary=user_roles, back_populates="roles")

class User(Base):
    __tablename__ = "users"

    id = Column(Integer, primary_key=True, autoincrement=True)
    username = Column(String(50), unique=True, nullable=False)
    email = Column(String(100), unique=True, nullable=False)
    profile_id = Column(Integer, ForeignKey("profiles.id"), unique=True, comment="One-to-one link")

    orders = relationship("Order", back_populates="user", cascade="all, delete-orphan")
    profile = relationship("Profile", back_populates="user", uselist=False)
    roles = relationship("Role", secondary=user_roles, back_populates="users")

class Profile(Base):
    __tablename__ = "profiles"

    id = Column(Integer, primary_key=True, autoincrement=True)
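    bio = Column(String(500), nullable=True, comment="Short bio")
    avatar_url = Column(String(255), nullable=True, comment="Avatar URL")

    # The foreign key lives on users.profile_id, so this side needs uselist=False to be scalar
    user = relationship("User", back_populates="profile", uselist=False)

python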
session = SessionLocal()

# Many-to-many: attach two roles to a new user
user = User(username="alice", email="alice@example.com")
role_admin = Role(name="Administrator", code="admin")
role_editor = Role(name="Editor", code="editor")
user.roles.append(role_admin)
user.roles.append(role_editor)
session.add_all([user, role_admin, role_editor])
session.commit()

# Navigate the relationship from both sides
user = session.query(User).filter_by(username="alice").first()
print("Roles of the user:", [r.name for r in user.roles])

role = session.query(Role).filter_by(code="admin").first()
print("Users with the role:", [u.username for u in role.users])

# One-to-one: assign a profile
profile = Profile(bio="Hello, I'm Alice", avatar_url="/avatars/alice.png")
user.profile = profile
session.commit()

user = session.query(User).filter_by(username="alice").first()
print("User bio:", user.profile.bio)

# Remove one role; only the association row in user_roles is deleted
user.roles.remove(role_editor)
session.commit()

session.close()
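
CRUD Operations in Detail

All CRUD operations go through the Session object: add inserts one object, add_all inserts several, query reads, delete removes, and an update is an attribute assignment followed by commit.

python
session = SessionLocal()

try:
    # Create: insert one row, then refresh to load database-generated values
    user = User(username="alice", email="alice@example.com", age=25)
    session.add(user)
    session.commit()
    session.refresh(user)
    print(f"Created user: id={user.id}, username={user.username}")

    # Create: insert several rows at once
    users = [
        User(username="bob", email="bob@example.com", age=30),
        User(username="charlie", email="charlie@example.com", age=28),
        User(username="dave", email="dave@example.com", age=22),
    ]
    session.add_all(users)
    session.commit()

    # Update: load the object, change an attribute, commit
    user = session.query(User).filter_by(username="alice").first()
    user.age = 26
    session.commit()

    # Update: bulk UPDATE without loading objects
    session.query(User).filter(User.username == "dave").update({"age": 23})
    session.commit()

    # Delete: remove a loaded object
    user = session.query(User).filter_by(username="bob").first()
    session.delete(user)
    session.commit()

    # Delete: bulk DELETE by condition
    session.query(User).filter(User.age < 20).delete()
    session.commit()

except Exception:
    session.rollback()  # Keep the session usable after a failed operation
    raise
finally:
    session.close()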

Advanced Queries

SQLAlchemy offers a rich query API: filter/filter_by for conditions, join for joins, order_by for sorting, limit/offset for pagination, func for aggregates, and subquery for subqueries.

python
from sqlalchemy import func, or_, and_, desc, asc

session = SessionLocal()

try:
    user = session.query(User).filter_by(username="alice").first()
    user = session.query(User).filter(User.username == "alice").first()

    users = session.query(User).filter(User.age >= 18, User.age <= 30).all()
    users = session.query(User).filter(User.age.between(18, 30)).all()

    users = session.query(User).filter(
        or_(User.username == "alice", User.username == "bob")
    ).all()

    users = session.query(User).filter(
        and_(User.age >= 20, User.status == UserStatus.ACTIVE)
    ).all()

    users = session.query(User).filter(User.username.like("a%")).all()
    users = session.query(User).filter(User.username.ilike("%li%")).all()
    users = session.query(User).filter(User.username.in_(["alice", "bob"])).all()

    users = session.query(User).order_by(desc(User.created_at)).all()
    users = session.query(User).order_by(asc(User.age)).all()

    page = 2
    page_size = 10
    users = session.query(User).offset((page - 1) * page_size).limit(page_size).all()

    total = session.query(func.count(User.id)).scalar()
    avg_age = session.query(func.avg(User.age)).scalar()
    max_age = session.query(func.max(User.age)).scalar()

    results = session.query(
        User.username,
        func.count(Order.id).label("order_count"),
        func.sum(Order.total_amount).label("total_spent")
    ).join(Order, User.id == Order.user_id).group_by(User.id).all()

    for username, count, total in results:
        print(f"{username}: {count} orders, total spent {total}")

    subq = session.query(
        Order.user_id,
        func.count(Order.id).label("cnt")
    ).group_by(Order.user_id).subquery()

    active_users = session.query(User).join(
        subq, User.id == subq.c.user_id
    ).filter(subq.c.cnt > 5).all()

    users_with_orders = session.query(User).join(Order).filter(
        Order.total_amount > 100
    ).distinct().all()

    users_without_orders = session.query(User).outerjoin(Order).filter(
        Order.id.is_(None)
    ).all()

    user_count = session.query(func.count(User.id)).scalar()
    active_count = session.query(func.count(User.id)).filter(
        User.status == UserStatus.ACTIVE
    ).scalar()

finally:
    session.close()
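
Naming Conventions and MetaData Configuration

Setting naming_convention on MetaData gives indexes, unique constraints, foreign keys, and other constraints predictable names, which helps Alembic autogenerate migration scripts.

python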
from sqlalchemy import MetaData
from sqlalchemy.orm import DeclarativeBase

convention = {
    "ix": "ix_%(column_0_label)s",
    "uq": "uq_%(table_name)s_%(column_0_name)s",
    "ck": "ck_%(table_name)s_%(constraint_name)s",
    "fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s",
    "pk": "pk_%(table_name)s"
}

class Base(DeclarativeBase):
    metadata = MetaData(naming_convention=convention)

SQLAlchemy 2.0 Style (select)

SQLAlchemy 2.0 recommends select() over query(): the syntax is more uniform and works better with type hints, and it is the style commonly used in FastAPI integrations.

python
from sqlalchemy import select, update, delete, func

session = SessionLocal()

try:
    stmt = select(User).where(User.username == "alice")
    user = session.scalars(stmt).first()

    stmt = select(User).where(User.age >= 18).order_by(User.created_at.desc())
    users = session.scalars(stmt).all()

    stmt = select(User.username, User.email).where(User.is_superuser == True)
    results = session.execute(stmt).all()

    stmt = (
        select(User.username, func.count(Order.id).label("order_count"))
        .join(Order)
        .group_by(User.id)
        .having(func.count(Order.id) > 3)
    )
    results = session.execute(stmt).all()

    stmt = (
        update(User)
        .where(User.status == UserStatus.INACTIVE)
        .values(status=UserStatus.ACTIVE)
    )
    session.execute(stmt)
    session.commit()

    stmt = delete(User).where(User.age < 18)
    session.execute(stmt)
    session.commit()

finally:
    session.close()

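Other Common ORM Libraries

Besides SQLAlchemy, the Python ecosystem has several ORM libraries for different situations. The main options compare as follows:

| ORM library | Best suited for | Async support | Characteristics |
| --- | --- | --- | --- |
| SQLAlchemy | Large projects needing the full feature set | ✅ 2.0+ | Most mature, most complete, largest community |
| SQLModel | FastAPI projects | Via SQLAlchemy's async extension | SQLAlchemy and Pydantic combined, friendly to type hints |
| Tortoise-ORM | Async projects (FastAPI/Sanic) | ✅ Native | Django-style API, async first |
| Peewee | Small projects, scripts and tools | No | Lightweight and concise, API resembles Django ORM |
| Django ORM | Django projects | Partial (async query methods since Django 4.1) | Built into Django, tightly coupled to the framework |

SQLModel

Concept: SQLModel was created by Sebastián Ramírez, the author of FastAPI, and merges SQLAlchemy ORM with Pydantic. A single class serves as both the ORM model and the Pydantic schema, which cuts duplicate definitions and fits FastAPI naturally.

bash
pip install sqlmodel
python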
from sqlmodel import SQLModel, Field, Session, create_engine, select
from typing import Optional
from datetime import datetime

engine = create_engine("mysql+pymysql://root:password@localhost/myapp")

class User(SQLModel, table=True):
    __tablename__ = "users"

    id: Optional[int] = Field(default=None, primary_key=True)
    username: str = Field(index=True, unique=True, max_length=50)
    email: str = Field(unique=True, max_length=100)
    age: Optional[int] = Field(default=0, ge=0, le=150)
    is_active: bool = Field(default=True)
    created_at: Optional[datetime] = Field(default_factory=datetime.now)

SQLModel.metadata.create_all(engine)

with Session(engine) as session:
    user = User(username="alice", email="alice@example.com", age=25)
    session.add(user)
    session.commit()
    session.refresh(user)

    statement = select(User).where(User.age >= 18)
    users = session.exec(statement).all()

    user.age = 26
    session.add(user)
    session.commit()

    session.delete(user)
    session.commit()
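
python
from datetime import datetime
from typing import Optional

from fastapi import FastAPI, Depends
from sqlmodel import Field, Session, SQLModel, select

# engine and User are the objects defined in the previous snippet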

app = FastAPI()

def get_session():
    with Session(engine) as session:
        yield session

class UserCreate(SQLModel):
    username: str = Field(max_length=50)
    email: str = Field(max_length=100)
    age: Optional[int] = Field(default=0, ge=0, le=150)

class UserRead(SQLModel):
    id: int
    username: str
    email: str
    age: Optional[int] = 0
    is_active: bool = True
    created_at: Optional[datetime] = None

@app.post("/users", response_model=UserRead)
def create_user(user: UserCreate, session: Session = Depends(get_session)):
    db_user = User.model_validate(user)
    session.add(db_user)
    session.commit()
    session.refresh(db_user)
    return db_user

@app.get("/users", response_model=list[UserRead])
def list_users(session: Session = Depends(get_session)):
    return session.exec(select(User)).all()

Tortoise-ORM

Concept: Tortoise-ORM is an asynchronous ORM inspired by Django ORM. Its API is concise and it supports async/await natively, which makes it a good match for async frameworks such as FastAPI.

bash
pip install tortoise-orm aiomysql
python
from tortoise import Tortoise, fields
from tortoise.models import Model

class User(Model):
    id = fields.IntField(pk=True)
    username = fields.CharField(max_length=50, unique=True)
    email = fields.CharField(max_length=100, unique=True)
    age = fields.IntField(default=0)
    is_active = fields.BooleanField(default=True)
    created_at = fields.DatetimeField(auto_now_add=True)

    class Meta:
        table = "users"

    def __str__(self):
        return self.username

class Order(Model):
    id = fields.IntField(pk=True)
    user = fields.ForeignKeyField("models.User", related_name="orders")
    total = fields.FloatField(default=0.0)
    created_at = fields.DatetimeField(auto_now_add=True)

    class Meta:
        table = "orders"

async def init():
    await Tortoise.init(
        db_url="mysql://root:password@localhost:3306/myapp",
        modules={"models": ["__main__"]}
    )
    await Tortoise.generate_schemas()

async def main():
    await init()

    user = await User.create(username="alice", email="alice@example.com", age=25)

    user = await User.get(username="alice")
    user.age = 26
    await user.save()

    users = await User.filter(age__gte=18).all()

    count = await User.filter(is_active=True).count()

    await User.filter(username="alice").update(age=27)

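    await User.filter(username="alice").delete()

    # Close the database connections before the event loop shuts down
    await Tortoise.close_connections()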

import asyncio
asyncio.run(main())

python
from fastapi import FastAPI
from tortoise.contrib.fastapi import register_tortoise

from models import User  # Assumes the models above are saved in models.py

app = FastAPI()

register_tortoise(
    app,
    db_url="mysql://root:password@localhost:3306/myapp",
    modules={"models": ["models"]},
    generate_schemas=True,
    add_exception_handlers=True
)

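@app.get("/users")
async def list_users():
    # Return plain dicts, which FastAPI can serialize without extra schema setup
    return await User.all().values("id", "username", "email", "age")

Peewee

Concept: Peewee is a lightweight synchronous ORM with a concise, Django-like API, well suited to small projects and scripts. It has no async support, so it is a poor fit for high-concurrency async workloads.

bash
pip install peewee pymysql
python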
from peewee import (
    Model, MySQLDatabase, CharField, IntegerField,
    BooleanField, DateTimeField, FloatField, ForeignKeyField
)
from datetime import datetime

db = MySQLDatabase("myapp", user="root", password="password", host="localhost", port=3306)

class BaseModel(Model):
    class Meta:
        database = db

class User(BaseModel):
    username = CharField(unique=True, max_length=50)
    email = CharField(unique=True, max_length=100)
    age = IntegerField(default=0)
    is_active = BooleanField(default=True)
    created_at = DateTimeField(default=datetime.now)

class Order(BaseModel):
    user = ForeignKeyField(User, backref="orders")
    total = FloatField(default=0.0)
    created_at = DateTimeField(default=datetime.now)

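db.connect()
db.create_tables([User, Order])

# CREATE
user = User.create(username="alice", email="alice@example.com", age=25)

# READ one row, then UPDATE it through the instance
user = User.get(User.username == "alice")
user.age = 26
user.save()

# READ many rows (the query runs lazily, when iterated)
users = User.select().where(User.age >= 18)

# COUNT; "== True" is intentional, Peewee overloads == to build SQL
count = User.select().where(User.is_active == True).count()

# Bulk UPDATE without loading the rows
User.update(age=27).where(User.username == "alice").execute()

# DELETE through the instance
user = User.get(User.username == "alice")
user.delete_instance()

db.close()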
Django ORM

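Concept: Django ORM is built into the Django framework and tightly coupled to it. The API is mature and stable, and the migration system is comprehensive. Standalone use is possible but requires configuring Django settings by hand, so it is rarely a good choice outside a Django project.

python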
from django.db import models

class User(models.Model):
    username = models.CharField(max_length=50, unique=True)
    email = models.EmailField(unique=True)
    age = models.IntegerField(default=0)
    is_active = models.BooleanField(default=True)
    created_at = models.DateTimeField(auto_now_add=True)

    class Meta:
        db_table = "users"

class Order(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE, related_name="orders")
    total = models.FloatField(default=0.0)
    created_at = models.DateTimeField(auto_now_add=True)

    class Meta:
        db_table = "orders"
python
# CREATE
user = User.objects.create(username="alice", email="alice@example.com", age=25)

# READ one row, then UPDATE it through the instance
user = User.objects.get(username="alice")
user.age = 26
user.save()

# READ many rows (QuerySets are lazy)
users = User.objects.filter(age__gte=18, is_active=True)

# COUNT
count = User.objects.filter(is_active=True).count()

# Bulk UPDATE without loading the rows
User.objects.filter(username="alice").update(age=27)

# DELETE
User.objects.filter(username="alice").delete()

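ORM selection guide

| Scenario | Recommended ORM | Rationale |
| --- | --- | --- |
| FastAPI project (medium to large) | SQLAlchemy 2.0 | Most complete feature set, most mature ecosystem, solid async support |
| FastAPI project (rapid development) | SQLModel | Integrates with Pydantic, less duplicated code, quick to learn |
| FastAPI project (async first) | Tortoise-ORM | Natively async, concise Django-style API |
| Django project | Django ORM | Built into the framework, comprehensive migration system |
| Small scripts and tools | Peewee | Lightweight, minimal dependencies, simple API |
| Complex SQL required | SQLAlchemy Core | SQL expression layer, maximum flexibility |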

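5.7 Transactions in Depth

Concept: Transactions

A transaction groups several database operations into one unit that either takes effect completely or not at all. Transactions provide the ACID guarantees: Atomicity, Consistency, Isolation, and Durability.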
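
To make atomicity concrete, here is a minimal sketch of a transfer in which the second update fails and the first one is rolled back with it. It uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs without a server (so placeholders are `?` rather than `%s`); the `accounts` schema, the CHECK constraint, and the `transfer` helper are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "user_id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts (user_id, balance) VALUES (?, ?)",
    [(1, 100), (2, 50)],
)
conn.commit()


def transfer(conn, from_id, to_id, amount):
    """Move `amount` between accounts; both updates succeed or neither does."""
    try:
        # Credit first, so a failed debit leaves a pending change to undo
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE user_id = ?",
            (amount, to_id),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE user_id = ?",
            (amount, from_id),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft: undo the credit as well
        conn.rollback()
        return False


ok = transfer(conn, 1, 2, 30)       # enough funds
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1
balances = dict(conn.execute("SELECT user_id, balance FROM accounts"))
print(ok, failed, balances)
```

After the failed transfer, account 2 does not keep the 500 it was briefly credited with, because the rollback discards every change made since the last commit.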

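Transaction isolation levels

Concept: The isolation level determines how much concurrent transactions can see of each other's changes. MySQL supports four levels: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ (the InnoDB default), and SERIALIZABLE.

python
import pymysql

connection = pymysql.connect(
    host="localhost",
    user="root",
    password="pwd",
    database="myapp",
    autocommit=False  # control transactions manually
)

# In PyMySQL 1.0+, leaving this block closes the connection
with connection:
    # Set the isolation level; without GLOBAL/SESSION it applies to the next transaction only
    with connection.cursor() as cursor:
        cursor.execute("SET TRANSACTION ISOLATION LEVEL READ COMMITTED")

    # Start the transaction
    connection.begin()

    try:
        with connection.cursor() as cursor:
            # Transfer 100 from user 1 to user 2
            cursor.execute("UPDATE accounts SET balance = balance - 100 WHERE user_id = 1")
            cursor.execute("UPDATE accounts SET balance = balance + 100 WHERE user_id = 2")

        # Commit both updates together
        connection.commit()
    except Exception as e:
        # Undo both updates if either one failed
        connection.rollback()
        print(f"Transaction rolled back: {e}")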

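Isolation levels compared

| Isolation level | Dirty read | Non-repeatable read | Phantom read |
| --- | --- | --- | --- |
| READ UNCOMMITTED | Possible | Possible | Possible |
| READ COMMITTED | Not possible | Possible | Possible |
| REPEATABLE READ | Not possible | Not possible | Possible |
| SERIALIZABLE | Not possible | Not possible | Not possible |

The table follows the SQL standard. In practice, InnoDB's REPEATABLE READ also prevents most phantom reads through consistent snapshots and next-key locking.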

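Savepoints

Concept: A savepoint marks an intermediate point inside a transaction, so you can roll back to that point without abandoning the whole transaction. This is useful for partial rollback in complex transactions. Savepoint names cannot be passed as query parameters, so use fixed, valid identifiers and never build them from user input.

python
with connection:
    connection.begin()

    try:
        with connection.cursor() as cursor:
            cursor.execute("INSERT INTO users (username) VALUES ('a')")
            cursor.execute("SAVEPOINT sp1")

            cursor.execute("INSERT INTO users (username) VALUES ('b')")
            # Undo only the work done after sp1; the insert of 'a' is kept
            cursor.execute("ROLLBACK TO SAVEPOINT sp1")

        # Only 'a' is committed
        connection.commit()
    except Exception:
        connection.rollback()
        raise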
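
The effect of a partial rollback can be checked end to end with a small runnable sketch. It assumes the standard-library sqlite3 module in place of MySQL (SQLite accepts the same SAVEPOINT and ROLLBACK TO SAVEPOINT statements), and it opens the connection with `isolation_level=None` so that BEGIN and COMMIT are issued explicitly.

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode, so the
# transaction boundaries below are exactly the ones we write.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (username TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO users (username) VALUES ('a')")
conn.execute("SAVEPOINT sp1")
conn.execute("INSERT INTO users (username) VALUES ('b')")
# Discard everything done after sp1; the transaction itself stays open
conn.execute("ROLLBACK TO SAVEPOINT sp1")
conn.execute("COMMIT")

names = [row[0] for row in conn.execute("SELECT username FROM users")]
print(names)
```

Only the row inserted before the savepoint survives the commit.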

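5.8 SQL Injection Protection

Concept: SQL injection

SQL injection is a common attack in which malicious SQL is smuggled into the database through user input. The core defense is to never build SQL by concatenating strings with input and to always use parameterized queries.

✅ Correct: parameterized queries

Concept: A parameterized query passes values separately through placeholders (%s). The driver escapes each value before it reaches the server, so input cannot change the structure of the statement. Placeholders only work for values; table and column names must come from a fixed whitelist instead.

python
# ✅ Correct: parameterized query
with connection.cursor() as cursor:
    username = "alice' OR '1'='1"  # malicious input
    cursor.execute("SELECT * FROM users WHERE username = %s", (username,))
    # The whole string is treated as a single value, so the attack fails

# ✅ Correct: several parameters
with connection.cursor() as cursor:
    cursor.execute("SELECT * FROM users WHERE age > %s AND is_active = %s", (18, True))

# ✅ Correct: LIKE pattern passed as a parameter
# (escape % and _ in the input first if they should match literally)
with connection.cursor() as cursor:
    keyword = "abc"  # user input
    cursor.execute("SELECT * FROM users WHERE username LIKE %s", (f"%{keyword}%",))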

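❌ Wrong: string concatenation

Concept: Building SQL by string concatenation or formatting is the root cause of SQL injection. Use parameterized queries even when you believe the input is trustworthy.

python
# ❌ Wrong: never do this!
with connection.cursor() as cursor:
    username = "alice' OR '1'='1"  # malicious input
    cursor.execute(f"SELECT * FROM users WHERE username = '{username}'")
    # The input becomes part of the SQL text: the WHERE clause is now always true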
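
The difference between the two approaches can be observed directly. The following sketch runs the same malicious input through a concatenated query and a parameterized one. It assumes the standard-library sqlite3 module as a stand-in for MySQL, so the placeholder is `?` instead of `%s`; the table contents are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)]
)

malicious = "nobody' OR '1'='1"

# Unsafe: the input is spliced into the SQL text, so the WHERE clause
# becomes  username = 'nobody' OR '1'='1'  and matches every row.
unsafe_rows = conn.execute(
    f"SELECT username FROM users WHERE username = '{malicious}'"
).fetchall()

# Safe: the input is bound as one value and compared literally.
safe_rows = conn.execute(
    "SELECT username FROM users WHERE username = ?", (malicious,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated query leaks the whole table, while the parameterized query finds no user with that literal name.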

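Escaping wildcards in LIKE queries

Concept: Parameterization stops injection, but the LIKE wildcards (% and _) inside a parameter keep their special meaning. A user who submits % can match every row or force an expensive scan. To match input literally, escape the wildcards first, then pass the result as the LIKE parameter.

python
# Escape wildcards before using user input in a LIKE pattern
def escape_like(value):
    # Escape the backslash first, then the LIKE wildcards % and _
    return value.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")

with connection.cursor() as cursor:
    keyword = "100%"  # user input containing %
    escaped = escape_like(keyword)
    # The outer % are real wildcards; the escaped one matches a literal %
    cursor.execute("SELECT * FROM users WHERE username LIKE %s", (f"%{escaped}%",))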
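
As a quick check of what the escaping changes, the sketch below compares an unescaped and an escaped pattern for the input `100%`. It assumes sqlite3 in place of MySQL; unlike MySQL, SQLite has no default escape character, so the query spells out `ESCAPE '\'`. The `products` table and its rows are invented for the example.

```python
import sqlite3


def escape_like(value):
    """Escape the backslash first, then the LIKE wildcards % and _."""
    return value.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT)")
conn.executemany(
    "INSERT INTO products (name) VALUES (?)",
    [("100% cotton",), ("1000 items",), ("50% off",)],
)

# SQLite needs the escape character declared explicitly
sql = "SELECT name FROM products WHERE name LIKE ? ESCAPE '\\' ORDER BY name"
keyword = "100%"  # user input

# Unescaped: the trailing % acts as a wildcard, so "1000 items" matches too
loose = [r[0] for r in conn.execute(sql, (f"%{keyword}%",))]

# Escaped: only names containing the literal text "100%" match
strict = [r[0] for r in conn.execute(sql, (f"%{escape_like(keyword)}%",))]

print(loose, strict)
```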

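5.9 Error Handling and Retries

Concept: Error handling

Database operations can fail because of network interruptions, connection timeouts, deadlocks, and similar problems, so they need appropriate error handling and a retry mechanism.

Error types and how to handle them

Concept: Errors fall into two broad groups. Transient, connection-level errors (lost connection, timeout, deadlock) are usually worth retrying. Permanent errors (constraint violations, syntax errors) will fail the same way every time and should be raised immediately. In PyMySQL, OperationalError is a subclass of DatabaseError and deadlocks are reported as OperationalError, so the more specific except clause must come first.

python
import time

import pymysql
from pymysql.err import DatabaseError, InterfaceError, OperationalError

def get_connection():
    return pymysql.connect(
        host="localhost",
        user="root",
        password="pwd",
        database="myapp",
        charset="utf8mb4"
    )

def query_with_retry(sql, params=None, max_retries=3, retry_delay=1):
    """Run a read query, retrying only on connection-level errors."""
    for attempt in range(max_retries):
        try:
            connection = get_connection()
            with connection:
                with connection.cursor() as cursor:
                    cursor.execute(sql, params)
                    return cursor.fetchall()
        except (OperationalError, InterfaceError) as e:
            # Usually transient: lost connection, timeout, deadlock
            print(f"Attempt {attempt + 1} failed: {e}")
            if attempt == max_retries - 1:
                raise
            time.sleep(retry_delay * (attempt + 1))  # increasing delay
        except DatabaseError as e:
            # Permanent: constraint violation, syntax error, etc. Retrying cannot help
            print(f"Database error: {e}")
            raise

# Usage
results = query_with_retry("SELECT * FROM users")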
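
The retry policy itself is independent of the driver and can be tested without a database. The sketch below is one possible shape for it, using exponential backoff; `TransientError`, `retry`, and `flaky_query` are hypothetical names introduced for illustration, and the `sleep` parameter exists only so the delays can be recorded instead of actually waited for.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a driver's connection-level error."""


def retry(func, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Call func(); on TransientError wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_retries):
        try:
            return func()
        except TransientError:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


calls = {"count": 0}


def flaky_query():
    """Fail twice, then succeed, to simulate a briefly unavailable server."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("connection lost")
    return ["row"]


delays = []  # record the requested delays instead of sleeping
result = retry(flaky_query, max_retries=3, base_delay=0.5, sleep=delays.append)
print(result, delays)
```

Exponential backoff spaces retries further apart each time, which gives a struggling server room to recover; any other exception type passes straight through `retry` without being retried.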

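Automatic reconnection

Concept: A connection can go stale after an idle timeout or a server restart. PooledDB's ping=1 option checks each connection when it is taken from the pool and reopens dead ones. The wrapper below additionally rebuilds the whole pool if no working connection can be obtained.

python
from dbutils.pooled_db import PooledDB
import pymysql

class ReconnectPoolDB:
    def __init__(self, **kwargs):
        self.kwargs = kwargs
        self.pool = None
        self._create_pool()

    def _create_pool(self):
        self.pool = PooledDB(
            creator=pymysql,
            maxconnections=20,
            mincached=5,
            ping=1,  # check each connection when it is fetched from the pool
            **self.kwargs
        )

    def connection(self):
        try:
            conn = self.pool.connection()
            # Extra liveness check; PyMySQL reconnects if the link is dead
            conn.ping(reconnect=True)
            return conn
        except Exception:
            # No working connection available: rebuild the pool and try once more
            self._create_pool()
            return self.pool.connection()

# Usage
pool = ReconnectPoolDB(
    host="localhost",
    user="root",
    password="pwd",
    database="myapp"
)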

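5.10 Bulk Insert Optimization

Concept: Bulk inserts

Inserting a large number of rows one at a time in a loop is slow. The main optimizations are executemany, LOAD DATA INFILE, and committing in batches.

Bulk insert with executemany

Concept: executemany runs one statement for many parameter sets in a single API call. For INSERT ... VALUES statements, PyMySQL combines the rows into multi-row INSERT statements, which cuts network round trips compared with calling execute in a loop.

python
# Basic bulk insert
with connection.cursor() as cursor:
    sql = "INSERT INTO users (username, email) VALUES (%s, %s)"
    data = [
        ("user1", "user1@example.com"),
        ("user2", "user2@example.com"),
        # ... 10,000 rows
    ]
    cursor.executemany(sql, data)
    connection.commit()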

Batching strategy for large inserts

python
def batch_insert_optimized(table, columns, values_batch, batch_size=1000):
    """
    Large bulk insert: split the rows into batches and commit once per batch.

    table and columns are interpolated into the SQL text, so pass trusted
    constants only, never user input.
    """
    total = len(values_batch)
    with connection.cursor() as cursor:
        sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({', '.join(['%s'] * len(columns))})"

        for i in range(0, total, batch_size):
            batch = values_batch[i:i + batch_size]
            cursor.executemany(sql, batch)
            # One commit per batch keeps each transaction small
            connection.commit()
            print(f"Inserted {min(i + batch_size, total)}/{total} rows")

# Usage
batch_insert_optimized(
    table="users",
    columns=["username", "email", "age"],
    values_batch=[(f"user{i}", f"user{i}@example.com", i % 100) for i in range(100000)],
    batch_size=5000
)
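
Because table and column names cannot be sent as query parameters, a batching helper like the one above is only safe if those names are validated. The following sketch shows one way to combine batching with an identifier check. It assumes sqlite3 as a stand-in for MySQL (hence `?` placeholders); `safe_identifier` and `batch_insert` are hypothetical helpers, and the accepted pattern (letters, digits, underscore) is a deliberately strict assumption.

```python
import re
import sqlite3

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def safe_identifier(name):
    """Reject anything that is not a plain SQL identifier."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return name


def batch_insert(conn, table, columns, rows, batch_size=1000):
    """Insert rows in batches, committing once per batch. Returns rows inserted."""
    table = safe_identifier(table)
    cols = ", ".join(safe_identifier(c) for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({cols}) VALUES ({placeholders})"

    inserted = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.executemany(sql, batch)
        conn.commit()  # keep each transaction to one batch
        inserted += len(batch)
    return inserted


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, email TEXT, age INTEGER)")
rows = [(f"user{i}", f"user{i}@example.com", i % 100) for i in range(2500)]
inserted = batch_insert(conn, "users", ["username", "email", "age"], rows)
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(inserted, row_count)
```

Committing per batch means a failure part-way through leaves earlier batches in place; commit once at the end instead if the import must be all-or-nothing.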

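LOAD DATA INFILE (fastest)

Concept: LOAD DATA is MySQL's bulk import statement. Plain LOAD DATA INFILE makes the server read a file from its own filesystem. LOAD DATA LOCAL INFILE, used below, makes the client read the file and stream it to the server. Both avoid per-row statement overhead and are usually the fastest way to import data. LOCAL must be enabled on both sides: pass local_infile=True to pymysql.connect() and enable the local_infile system variable on the server.

python
# LOAD DATA is usually many times faster than row-by-row INSERT
import os

def load_data_infile(table, columns, filepath):
    """Bulk-import a CSV file with LOAD DATA LOCAL INFILE.

    table and columns are interpolated into the SQL text, so pass trusted
    constants only. The connection must be created with local_infile=True.
    """
    if not os.path.isfile(filepath):
        raise FileNotFoundError(f"File not found: {filepath}")
    # The client reads the file, so send the full path, not just the file name
    abs_path = os.path.abspath(filepath)
    with connection.cursor() as cursor:
        cols = ", ".join(columns)
        sql = f"""
            LOAD DATA LOCAL INFILE %s
            INTO TABLE {table}
            FIELDS TERMINATED BY ','
            ENCLOSED BY '"'
            LINES TERMINATED BY '\\n'
            IGNORE 1 LINES
            ({cols})
        """
        cursor.execute(sql, (abs_path,))
        connection.commit()

# Expected CSV layout (the header row is skipped by IGNORE 1 LINES)
# username,email,age
# user1,user1@example.com,25
# user2,user2@example.com,30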
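
Producing a file in exactly that layout is easiest with the standard csv module. The sketch below writes a header plus rows with a comma delimiter, double-quote quoting, and newline line endings, matching the FIELDS and LINES clauses used above; `write_users_csv` and the temporary file location are assumptions for illustration.

```python
import csv
import os
import tempfile


def write_users_csv(path, rows):
    """Write a header row plus data rows, one comma-separated record per line."""
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, lineterminator="\n")
        writer.writerow(["username", "email", "age"])
        writer.writerows(rows)


csv_path = os.path.join(tempfile.mkdtemp(), "users.csv")
write_users_csv(
    csv_path,
    [("user1", "user1@example.com", 25), ("user2", "user2@example.com", 30)],
)

with open(csv_path, encoding="utf-8") as f:
    lines = f.read().splitlines()
print(lines)
```

The writer only adds quotes around fields that need them (for example, a value containing a comma), which the ENCLOSED BY clause accepts on input.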

Async bulk insert (aiomysql)

Concept: Combining aiomysql with asyncio lets async frameworks such as FastAPI insert large batches without blocking the event loop while the database is working.

python
import asyncio
import aiomysql

async def batch_insert_async(pool, users):
    """Insert many rows through a pooled async connection."""
    async with pool.acquire() as conn:
        async with conn.cursor() as cursor:
            sql = "INSERT INTO users (username, email) VALUES (%s, %s)"
            # aiomysql supports executemany
            await cursor.executemany(sql, users)
            await conn.commit()

async def main():
    pool = await aiomysql.create_pool(host="localhost", user="root", password="pwd", db="myapp")
    try:
        users = [(f"user{i}", f"user{i}@example.com") for i in range(10000)]
        await batch_insert_async(pool, users)
    finally:
        # Close the pool and wait until every connection is released
        pool.close()
        await pool.wait_closed()

asyncio.run(main())

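5.11 Pagination in Depth

Concept: Pagination

There are two common approaches: OFFSET pagination (the traditional one) and cursor pagination, also called keyset pagination, which performs better. Cursor pagination is recommended for large tables.

OFFSET pagination (traditional)

Concept: OFFSET pagination uses LIMIT ... OFFSET .... It is simple and intuitive, but a large offset is slow because the database must read and discard every skipped row.

python
# Traditional OFFSET pagination
def paginate_offset(page, page_size):
    offset = (page - 1) * page_size
    with connection.cursor() as cursor:
        # Fetch the requested page
        cursor.execute(
            "SELECT * FROM users ORDER BY id LIMIT %s OFFSET %s",
            (page_size, offset)
        )
        data = cursor.fetchall()

        # Count all rows (expensive on large tables)
        cursor.execute("SELECT COUNT(*) FROM users")
        total = cursor.fetchone()[0]  # assumes the default tuple cursor

        return {
            "data": data,
            "page": page,
            "page_size": page_size,
            "total": total,
            # Ceiling division: a partial last page still counts as a page
            "total_pages": (total + page_size - 1) // page_size
        }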

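Cursor pagination (recommended for large tables)

Concept: Cursor pagination queries from the ID of the last row on the previous page, using WHERE id > last_id. Because the database can seek directly to that position through the index on id, the cost of fetching a page stays roughly the same however deep you go. It fits infinite-scroll interfaces well.

python
# Cursor pagination: continue from the last ID of the previous page
def paginate_cursor(last_id, page_size):
    """
    Cursor pagination: better performance on large tables
    """
    with connection.cursor() as cursor:
        if last_id is None:
            # First page
            cursor.execute(
                "SELECT * FROM users ORDER BY id LIMIT %s",
                (page_size,)
            )
        else:
            cursor.execute(
                "SELECT * FROM users WHERE id > %s ORDER BY id LIMIT %s",
                (last_id, page_size)
            )
        data = cursor.fetchall()

        # Assumes id is the first column and the default tuple cursor is used
        next_cursor = data[-1][0] if data else None

        return {
            "data": data,
            "next_cursor": next_cursor,
            # A full page suggests more rows; the next page may still be empty
            "has_more": len(data) == page_size
        }

# Usage: walk through every page, including the last one
last_id = None
while True:
    result = paginate_cursor(last_id=last_id, page_size=10)
    print(result["data"])
    if not result["has_more"]:
        break
    last_id = result["next_cursor"]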
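
The page-walking logic can be exercised without a MySQL server. The sketch below assumes sqlite3 as a stand-in (with `?` placeholders) and uses a slightly different `has_more` test from the function above: it fetches one extra row per page, which tells it exactly whether another page exists. `fetch_page` is a hypothetical helper, and it assumes IDs are positive integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)",
    [(f"user{i}",) for i in range(1, 26)],  # 25 rows, ids 1..25
)
conn.commit()


def fetch_page(conn, last_id, page_size):
    """Return (rows, next_cursor, has_more) for the page after last_id."""
    # Ask for one row more than needed to learn whether another page exists
    rows = conn.execute(
        "SELECT id, username FROM users WHERE id > ? ORDER BY id LIMIT ?",
        (last_id or 0, page_size + 1),
    ).fetchall()
    has_more = len(rows) > page_size
    rows = rows[:page_size]
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor, has_more


pages = []
cursor_id = None
while True:
    rows, cursor_id, has_more = fetch_page(conn, cursor_id, page_size=10)
    pages.append([row[0] for row in rows])
    if not has_more:
        break

print([len(p) for p in pages])
```

With 25 rows and a page size of 10, the loop makes three queries and never issues a final query that returns nothing.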

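OFFSET vs. cursor pagination

Concept: Each approach has trade-offs. OFFSET pagination suits interfaces that jump to arbitrary pages (clickable page numbers). Cursor pagination suits infinite scroll and large tables where performance matters.

| Property | OFFSET pagination | Cursor pagination |
| --- | --- | --- |
| Implementation complexity | Simple | Slightly more complex |
| Performance on large tables | Slow (grows with the offset) | Fast (index seek, independent of page depth) |
| Jump to arbitrary page | Supported | Not supported |
| Best for | Small tables, page jumping | Large tables, sequential browsing |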

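5.12 Database Migrations

Concept: Database migrations

Migrations manage changes to the database schema (creating tables, altering columns, and so on) and keep the schema consistent across environments.

Alembic migrations

Concept: Alembic is the migration tool recommended by the SQLAlchemy project. It manages schema changes through versioned migration scripts and supports upgrades, downgrades, and version history. The --autogenerate option compares the database with your models, so target_metadata must be set in alembic/env.py for it to detect anything.

bash
pip install alembic
bash
# Initialize the migration environment
alembic init alembic

# Generate a migration script from model changes
alembic revision --autogenerate -m "Add users table"

# Upgrade to the latest revision
alembic upgrade head

# Downgrade by one revision
alembic downgrade -1

# Show the revision history
alembic history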
Example migration script

python
# alembic/versions/xxxx_add_users.py
# (the revision / down_revision identifiers that Alembic generates are omitted here)
from alembic import op
import sqlalchemy as sa

def upgrade():
    # Applied by "alembic upgrade"
    op.create_table(
        'users',
        sa.Column('id', sa.Integer(), nullable=False),
        sa.Column('username', sa.String(50), nullable=False),
        sa.Column('email', sa.String(100), nullable=False),
        sa.PrimaryKeyConstraint('id'),
        sa.UniqueConstraint('username'),
        sa.UniqueConstraint('email')
    )

def downgrade():
    # Applied by "alembic downgrade"; must undo exactly what upgrade() did
    op.drop_table('users')

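Django migrations

Concept: Django ships with its own ORM and migration system. makemigrations generates migration files from model changes, and migrate applies them. This fits full-stack Django projects well.

bash
# Django's built-in migration commands
python manage.py makemigrations  # create migration files
python manage.py migrate         # apply migrations
python manage.py showmigrations  # show migration status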

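5.13 Async ORM

Concept: Async ORM

An async ORM lets you work with the database from async code without blocking the event loop. SQLAlchemy has supported asyncio since version 1.4; the example below uses the 2.0 API (async_sessionmaker, DeclarativeBase).

Async SQLAlchemy

Concept: Async SQLAlchemy uses create_async_engine to create the engine and async_sessionmaker to create a session factory, then relies on async/await for non-blocking database operations.

bash
pip install "sqlalchemy[asyncio]" aiosqlite
python
import asyncio
from sqlalchemy.ext.asyncio import create_async_engine, AsyncSession, async_sessionmaker
from sqlalchemy.orm import DeclarativeBase
from sqlalchemy import Column, Integer, String, select

class Base(DeclarativeBase):
    pass

class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    username = Column(String(50))

engine = create_async_engine(
    "sqlite+aiosqlite:///myapp.db",
    echo=True
)

# expire_on_commit=False keeps loaded attributes usable after commit
async_session = async_sessionmaker(engine, class_=AsyncSession, expire_on_commit=False)

# CRUD operations
async def crud_operations():
    # Create the tables first; otherwise the INSERT below fails
    async with engine.begin() as conn:
        await conn.run_sync(Base.metadata.create_all)

    async with async_session() as session:
        # CREATE
        user = User(username="alice")
        session.add(user)
        await session.commit()

        # READ
        result = await session.execute(
            select(User).where(User.username == "alice")
        )
        user = result.scalar_one_or_none()

        # UPDATE
        user.username = "alice_new"
        await session.commit()

        # DELETE
        await session.delete(user)
        await session.commit()

    # Release the engine's connections before the event loop closes
    await engine.dispose()

asyncio.run(crud_operations())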

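5.14 Case Study: User Management System

Complete CRUD example

Concept: This user management example combines a connection pool with a contextmanager-based helper. It implements creating, reading, listing, updating, deleting, searching, and batch-creating users.

python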
import pymysql
from dbutils.pooled_db import PooledDB
from contextlib import contextmanager

# Connection pool shared by all UserService calls
pool = PooledDB(
    creator=pymysql,
    maxconnections=20,
    mincached=5,
    host="localhost",
    user="root",
    password="pwd",
    database="myapp",
    charset="utf8mb4"
)

@contextmanager
def get_connection():
    conn = pool.connection()
    try:
        yield conn
    finally:
        conn.close()

class UserService:
    @staticmethod
    def create(username, email, age=0):
        with get_connection() as conn:
            with conn.cursor() as cursor:
                sql = "INSERT INTO users (username, email, age) VALUES (%s, %s, %s)"
                cursor.execute(sql, (username, email, age))
                conn.commit()
                return cursor.lastrowid

    @staticmethod
    def get_by_id(user_id):
        with get_connection() as conn:
            with conn.cursor(pymysql.cursors.DictCursor) as cursor:
                cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
                return cursor.fetchone()

    @staticmethod
    def get_by_username(username):
        with get_connection() as conn:
            with conn.cursor(pymysql.cursors.DictCursor) as cursor:
                cursor.execute("SELECT * FROM users WHERE username = %s", (username,))
                return cursor.fetchone()

    @staticmethod
    def list_all(page=1, page_size=10):
        offset = (page - 1) * page_size
        with get_connection() as conn:
            with conn.cursor(pymysql.cursors.DictCursor) as cursor:
                cursor.execute("SELECT * FROM users ORDER BY id LIMIT %s OFFSET %s", (page_size, offset))
                return cursor.fetchall()

    @staticmethod
    def update(user_id, **kwargs):
        if not kwargs:
            return False
        allowed_fields = {"username", "email", "age", "is_active"}
        filtered = {k: v for k, v in kwargs.items() if k in allowed_fields}
        if not filtered:
            return False
        fields = ", ".join([f"{k} = %s" for k in filtered.keys()])
        values = list(filtered.values()) + [user_id]
        with get_connection() as conn:
            with conn.cursor() as cursor:
                sql = f"UPDATE users SET {fields} WHERE id = %s"
                cursor.execute(sql, values)
                conn.commit()
                return cursor.rowcount > 0

    @staticmethod
    def delete(user_id):
        with get_connection() as conn:
            with conn.cursor() as cursor:
                cursor.execute("DELETE FROM users WHERE id = %s", (user_id,))
                conn.commit()
                return cursor.rowcount > 0

    @staticmethod
    def search(keyword, page=1, page_size=10):
        offset = (page - 1) * page_size
        # Escape LIKE wildcards so the keyword is matched literally
        escaped = keyword.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
        pattern = f"%{escaped}%"
        with get_connection() as conn:
            with conn.cursor(pymysql.cursors.DictCursor) as cursor:
                sql = """
                    SELECT * FROM users
                    WHERE username LIKE %s OR email LIKE %s
                    ORDER BY id LIMIT %s OFFSET %s
                """
                cursor.execute(sql, (pattern, pattern, page_size, offset))
                return cursor.fetchall()

    @staticmethod
    def batch_create(users_data):
        with get_connection() as conn:
            with conn.cursor() as cursor:
                sql = "INSERT INTO users (username, email, age) VALUES (%s, %s, %s)"
                cursor.executemany(sql, users_data)
                conn.commit()
                return cursor.rowcount

# Usage example
if __name__ == "__main__":
    # Create a user
    user_id = UserService.create("alice", "alice@example.com", 25)
    print(f"Created user ID: {user_id}")

    # Fetch the user that was just created
    user = UserService.get_by_id(user_id)
    print(f"User: {user}")

    # Update the user
    UserService.update(user_id, age=26, is_active=True)

    # Paginated listing
    users = UserService.list_all(page=1, page_size=10)
    for u in users:
        print(u)

    # Search
    results = UserService.search("alice")
    print(f"Search results: {results}")

    # Batch create
    UserService.batch_create([
        ("bob", "bob@example.com", 30),
        ("charlie", "charlie@example.com", 28)
    ])

    # Delete the user
    UserService.delete(user_id)
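
Each service method above commits by hand. A possible refinement, sketched here under assumptions, is a second context manager that commits when its block succeeds and rolls back when it raises, so multi-statement operations stay atomic. The `transaction` helper is hypothetical, and the demonstration uses sqlite3 in place of the pooled PyMySQL connection so that it runs without a server.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def transaction(conn):
    """Commit if the block succeeds; roll back and re-raise if it fails."""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT UNIQUE)")
conn.commit()

# Successful block: committed automatically
with transaction(conn) as tx:
    tx.execute("INSERT INTO users (username) VALUES (?)", ("alice",))

# Failing block: the insert of "bob" is rolled back with the duplicate
duplicate_rejected = False
try:
    with transaction(conn) as tx:
        tx.execute("INSERT INTO users (username) VALUES (?)", ("bob",))
        tx.execute("INSERT INTO users (username) VALUES (?)", ("alice",))
except sqlite3.IntegrityError:
    duplicate_rejected = True

usernames = [row[0] for row in conn.execute("SELECT username FROM users ORDER BY id")]
print(duplicate_rejected, usernames)
```

Re-raising after the rollback matters: the caller still learns that the operation failed, while the database is left as it was before the block started.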