title: MongoDB索引优化的艺术：从基础原理到性能调优实战

date: 2025/05/21 18:08:22

updated: 2025/05/21 18:08:22

excerpt:

MongoDB索引优化与性能调优的核心策略包括：索引基础原理，如单字段、复合、唯一和TTL索引；索引创建与管理，通过FastAPI集成Motor实现；查询性能优化，使用Explain分析、覆盖查询和聚合管道优化；实战案例，如电商平台订单查询优化；常见报错解决方案，如索引创建失败、查询性能下降和文档扫描过多问题。这些策略能显著提升查询速度和系统性能。

categories:

后端开发
FastAPI

tags:

MongoDB
索引优化
性能调优
FastAPI
查询分析
聚合管道
错误处理

扫描二维码

关注或者微信搜一搜：编程智域前端至全栈交流与成长

探索数千个预构建的 AI 应用，开启你的下一个伟大创意：https://tools.cmdragon.cn/

第四章：索引优化策略与性能调优

1. MongoDB索引基础原理

在MongoDB中，索引相当于图书的目录系统。当集合存储量达到百万级时，合理的索引设计能让查询速度提升10-100倍。索引本质上是特殊的数据结构（B-Tree），存储着字段值的排序副本。

主要索引类型：

python 复制代码

# 单字段索引示例
async def create_single_index():
    await db.products.create_index("name")


# 复合索引示例（注意字段顺序）
async def create_compound_index():
    await db.orders.create_index([("user_id", 1), ("order_date", -1)])


# 唯一索引示例
async def create_unique_index():
    await db.users.create_index("email", unique=True)


# TTL索引示例（自动过期）
async def create_ttl_index():
    await db.logs.create_index("created_at", expireAfterSeconds=3600)

2. 索引创建与管理实战

在FastAPI中集成Motor进行索引管理的最佳实践：

python 复制代码

from fastapi import APIRouter
from motor.motor_asyncio import AsyncIOMotorClient
from pydantic import BaseModel

router = APIRouter()


class Product(BaseModel):
    name: str
    price: float
    category: str


# 连接MongoDB
client = AsyncIOMotorClient("mongodb://localhost:27017")
db = client["ecommerce"]


@router.on_event("startup")
async def initialize_indexes():
    # 创建复合索引
    await db.products.create_index([("category", 1), ("price", -1)])

    # 文本搜索索引
    await db.products.create_index([("name", "text")])

    # 地理位置索引（需2dsphere）
    await db.stores.create_index([("location", "2dsphere")])

3. 查询性能优化策略

3.1 使用Explain分析查询

python 复制代码

async def analyze_query():
    cursor = db.products.find({"price": {"$gt": 100}})
    explain = await cursor.explain()
    print(f"使用的索引：{explain['queryPlanner']['winningPlan']['inputStage']['indexName']}")
    print(f"扫描文档数：{explain['executionStats']['totalDocsExamined']}")

3.2 覆盖查询优化

python 复制代码

async def covered_query():
    # 只查询索引包含的字段
    projection = {"_id": 0, "name": 1, "category": 1}
    cursor = db.products.find({"category": "electronics"}, projection)
    async for doc in cursor:
        print(doc)

3.3 聚合管道优化

python 复制代码

async def optimized_aggregation():
    pipeline = [
        {"$match": {"status": "completed"}},
        {"$sort": {"total_amount": -1}},
        {"$group": {
            "_id": "$user_id",
            "total_spent": {"$sum": "$total_amount"}
        }},
        {"$limit": 10}
    ]

    # 添加hint强制使用索引
    cursor = db.orders.aggregate(pipeline).hint([("status", 1), ("total_amount", -1)])
    results = await cursor.to_list(length=10)
    return results

4. 性能调优实战案例

电商平台订单查询优化：

python 复制代码

class OrderQuery(BaseModel):
    user_id: str
    start_date: datetime
    end_date: datetime
    min_amount: float = None


@router.post("/orders/search")
async def search_orders(query: OrderQuery):
    # 构建查询条件
    conditions = {
        "user_id": query.user_id,
        "order_date": {"$gte": query.start_date, "$lte": query.end_date}
    }
    if query.min_amount:
        conditions["total_amount"] = {"$gte": query.min_amount}

    # 使用复合索引优化查询
    projection = {"_id": 0, "order_id": 1, "total_amount": 1, "items": 1}
    cursor = db.orders.find(
        conditions,
        projection
    ).sort("order_date", -1).hint([("user_id", 1), "order_date", -1)])

    return await cursor.to_list(length=100)

5. 课后Quiz

Q1：以下哪种索引顺序更适合查询db.orders.find({"status":"shipped", "total":{$gt:100}}).sort("ship_date":1)？

A) (status, total, ship_date)

B) (status, ship_date, total)

C) (ship_date, status, total)

正确答案：A

解析：等值查询字段(status)应放在最前，范围查询字段(total)在后，排序字段(ship_date)在最后可以避免内存排序

Q2：如何判断查询是否使用了覆盖索引？

A) 检查执行时间

B) 查看explain输出中的totalDocsExamined

C) 观察返回字段是否都在索引中

正确答案：C

解析：覆盖查询需要所有返回字段都包含在索引中，且查询不包含_id字段或显式排除

6. 常见报错解决方案

报错1：OperationFailure: Error creating index

原因：尝试在已存在重复值的字段上创建唯一索引

解决：

python 复制代码

# 先清理重复数据
async def clean_duplicate_emails():
    pipeline = [
        {"$group": {"_id": "$email", "dups": {"$push": "$_id"}, "count": {"$sum": 1}}},
        {"$match": {"count": {"$gt": 1}}}
    ]
    async for dup in db.users.aggregate(pipeline):
        await db.users.delete_many({"_id": {"$in": dup["dups"][1:]}})

报错2：查询性能突然下降

可能原因：索引碎片化或统计信息过期

解决：

python 复制代码

# 重建索引
async def rebuild_indexes():
    await db.products.drop_index("category_1_price_-1")
    await db.products.create_index([("category", 1), ("price", -1)])

报错3：Executor error during find command: Too many documents scanned

原因：查询未命中索引或索引选择不当

解决：

使用explain分析查询计划
添加适当的索引
优化查询条件，减少扫描范围

运行环境要求：

Python 3.8+
FastAPI==0.78.0
motor==3.1.1
pydantic==1.10.7
uvicorn==0.18.2

安装命令：

bash 复制代码

pip install fastapi==0.78.0 motor==3.1.1 pydantic==1.10.7 uvicorn==0.18.2

余下文章内容请点击跳转至个人博客页面或者扫码关注或者微信搜一搜：编程智域前端至全栈交流与成长，阅读完整的文章：MongoDB索引优化的艺术：从基础原理到性能调优实战 | cmdragon's Blog

MongoDB索引优化的艺术：从基础原理到性能调优

第四章：索引优化策略与性能调优

1. MongoDB索引基础原理

2. 索引创建与管理实战

3. 查询性能优化策略

3.1 使用Explain分析查询

3.2 覆盖查询优化

3.3 聚合管道优化

4. 性能调优实战案例

5. 课后Quiz

6. 常见报错解决方案

往期文章归档：