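Building a High-Performance Async HTTP Client: aiohttp and httpx in Practice, with Performance Tuning

"In an age of information overload, whoever fetches, processes, and responds to data fastest gains the edge."

In modern Python development, HTTP clients are almost everywhere: crawlers, API aggregation, microservice communication, data synchronization, and more. As data volumes and concurrency demands grow, traditional synchronous requests (for example, with requests) start to show performance bottlenecks.

Fortunately, Python has solid support for asynchronous programming. Combined with libraries such as aiohttp and httpx, it lets us build high-performance async HTTP clients whose throughput on I/O-bound workloads can be many times that of serial synchronous code.

This article starts from the principles and walks step by step through building a reusable async HTTP client, covering connection pooling, retries, rate limiting, and concurrent scheduling.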

I. Why choose an async HTTP client?

1. The bottleneck of synchronous requests

Take requests as an example:

import requests

def fetch(url):
    response = requests.get(url)
    return response.text

When you need to fetch many pages, the calls run one after another:

urls = [f"https://example.com/page/{i}" for i in range(100)]
results = [fetch(url) for url in urls]  # runs serially: each call blocks until the previous one returns

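Each request waits for the previous one to finish, so the program spends most of its time idle, waiting on network responses.

2. The advantages of async

Async programming lets the program switch to another task while one is waiting on I/O, which yields high concurrency with low resource usage.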

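Mode	Concurrency	Resource usage	Typical use
Synchronous (requests)	Low	High (blocked while waiting)	Simple scripts, low concurrency
Multithreading	Medium	Medium	Blocking I/O at moderate concurrency (the GIL limits gains on CPU-bound work)
Async (aiohttp/httpx)	High	Low	Network I/O-bound work such as crawlers and API aggregation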
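
To make the difference concrete, here is a minimal sketch that replaces real HTTP calls with asyncio.sleep, so it runs without network access. The names fake_request, run_serial, run_concurrent, and compare are illustrative assumptions, not part of any library.

```python
import asyncio
import time


async def fake_request(i: int, latency: float = 0.05) -> int:
    """Stand-in for an HTTP call: sleeps instead of touching the network."""
    await asyncio.sleep(latency)
    return i


async def run_serial(n: int) -> float:
    """Await the requests one by one and return the elapsed seconds."""
    start = time.perf_counter()
    for i in range(n):
        await fake_request(i)
    return time.perf_counter() - start


async def run_concurrent(n: int) -> float:
    """Start all requests at once with gather and return the elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_request(i) for i in range(n)))
    return time.perf_counter() - start


async def compare(n: int = 10):
    """Return (serial_seconds, concurrent_seconds) for n simulated requests."""
    return await run_serial(n), await run_concurrent(n)


if __name__ == "__main__":
    serial, concurrent = asyncio.run(compare())
    print(f"serial: {serial:.2f}s, concurrent: {concurrent:.2f}s")
```

The serial run takes roughly n times the latency, while the concurrent run takes roughly one latency, because all the waits overlap.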

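II. Core capabilities of an async HTTP client

A high-performance async HTTP client should offer at least the following:

Concurrent request scheduling (asyncio + gather)
Connection pool reuse (fewer TCP handshakes)
Retries (to ride out transient network failures)
Timeouts and exception handling (so a stalled request cannot hang the program)
Rate limiting and throttling (to avoid IP bans)
An extensible interface (for easy reuse)

Next, we implement these capabilities with aiohttp and then httpx, and compare their performance.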
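
Of the capabilities listed, the timeout is the one most easily shown in isolation. The sketch below uses only the standard library's asyncio.wait_for. The helper name with_deadline and the two stand-in coroutines are assumptions made for illustration.

```python
import asyncio


async def with_deadline(coro, seconds: float, default=None):
    """Await `coro`, but give up and return `default` once `seconds` have elapsed."""
    try:
        return await asyncio.wait_for(coro, timeout=seconds)
    except asyncio.TimeoutError:
        # wait_for cancels the inner coroutine before raising, so nothing is left running.
        return default


async def slow():   # hypothetical stand-in for a request that stalls
    await asyncio.sleep(1)
    return "late"


async def fast():   # hypothetical stand-in for a request that returns promptly
    return "done"
```

Calling asyncio.run(with_deadline(slow(), 0.01, default="timed out")) yields the default instead of blocking for the full second. Both aiohttp and httpx also provide their own timeout settings, which are usually preferable for real requests.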

III. Building an async HTTP client with aiohttp

1. Basic usage

import aiohttp
import asyncio

async def fetch(session, url):
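    # ClientTimeout is aiohttp's documented timeout type; a bare number is kept only for backward compatibility.
    async with session.get(url, timeout=aiohttp.ClientTimeout(total=10)) as response: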
        return await response.text()

async def main():
    urls = [f"https://httpbin.org/get?i={i}" for i in range(10)]
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
        for res in results:
            print(res[:60], '...')

asyncio.run(main())

2. Adding a retry mechanism

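async def fetch_with_retry(session, url, retries=3):
    # ClientTimeout is aiohttp's documented timeout type; 10 s covers the whole request.
    timeout = aiohttp.ClientTimeout(total=10)
    for attempt in range(retries):
        try:
            async with session.get(url, timeout=timeout) as response:
                return await response.text()
        # Catch only network-level failures and timeouts, not programming errors.
        except (aiohttp.ClientError, asyncio.TimeoutError) as e:
            print(f"[{attempt + 1}] request failed: {e}")
            if attempt < retries - 1:
                await asyncio.sleep(1)  # brief pause before the next attempt
    return None  # every attempt failed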
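
A fixed one-second pause works for a demo, but retries are commonly spaced with exponential backoff plus a little random jitter so that many clients do not retry in lockstep. The following is a library-agnostic sketch. The names backoff_delay and retry, and the default values for base, cap, and jitter, are assumptions chosen for illustration.

```python
import asyncio
import random


def backoff_delay(attempt, base=0.5, cap=8.0, jitter=0.1):
    """Seconds to wait after failed attempt number `attempt` (0-based)."""
    delay = min(cap, base * 2 ** attempt)             # base, 2*base, 4*base, ... up to cap
    return delay + random.uniform(0, jitter * delay)  # add up to 10% random jitter


async def retry(make_call, retries=3, retry_on=(Exception,), base=0.5):
    """Await make_call() up to `retries` times; re-raise the last error if all fail."""
    for attempt in range(retries):
        try:
            return await make_call()
        except retry_on:
            if attempt == retries - 1:
                raise  # out of attempts: let the caller see the real error
            await asyncio.sleep(backoff_delay(attempt, base=base))
```

With aiohttp this could be used as await retry(lambda: fetch(session, url), retry_on=(aiohttp.ClientError, asyncio.TimeoutError)). Passing a callable rather than a coroutine matters, because a coroutine object can only be awaited once.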

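3. Adding throttling (semaphore)

A semaphore caps how many requests are in flight at once; it does not by itself cap the number of requests per second.

semaphore = asyncio.Semaphore(5)  # at most 5 requests in flight at once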

async def fetch_limited(session, url):
    async with semaphore:
        return await fetch_with_retry(session, url)
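
If a target site enforces a requests-per-second limit, a semaphore alone may not be enough, since five fast requests can still complete many times per second. Below is a minimal sketch of a limiter that spaces out request starts. The class name RateLimiter, its wait method, and the measure helper are hypothetical, and this is only one simple way to do it.

```python
import asyncio
import time


class RateLimiter:
    """Spaces out callers so that at most `rate` of them proceed per second."""

    def __init__(self, rate: float):
        self._interval = 1.0 / rate
        self._next_slot = 0.0

    async def wait(self) -> None:
        now = time.monotonic()
        # There is no await between reading and writing _next_slot,
        # so concurrent tasks on one event loop cannot claim the same slot.
        slot = max(now, self._next_slot)
        self._next_slot = slot + self._interval
        if slot > now:
            await asyncio.sleep(slot - now)


async def measure(n: int, rate: float) -> float:
    """Run n no-op jobs through the limiter and return the elapsed seconds."""
    limiter = RateLimiter(rate)

    async def job():
        await limiter.wait()  # a real job would issue its request after this line

    start = time.monotonic()
    await asyncio.gather(*(job() for _ in range(n)))
    return time.monotonic() - start
```

In a client, await limiter.wait() would go just before the request, optionally inside the semaphore block so that both limits apply.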

4. Wrapping it in a reusable client class

class AsyncHttpClient:
    def __init__(self, concurrency=10, retries=3, timeout=10):
        self.semaphore = asyncio.Semaphore(concurrency)
        self.retries = retries
        self.timeout = timeout
        self.session = None

    async def __aenter__(self):
        self.session = aiohttp.ClientSession()
        return self

    async def __aexit__(self, *args):
        await self.session.close()

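    async def get(self, url):
        # ClientTimeout is aiohttp's documented timeout type for requests.
        timeout = aiohttp.ClientTimeout(total=self.timeout)
        async with self.semaphore:
            for attempt in range(self.retries):
                try:
                    async with self.session.get(url, timeout=timeout) as resp:
                        return await resp.text()
                # Catch only network-level failures and timeouts, not programming errors.
                except (aiohttp.ClientError, asyncio.TimeoutError) as e:
                    print(f"[{attempt + 1}] request failed: {e}")
                    if attempt < self.retries - 1:
                        await asyncio.sleep(1)  # brief pause before the next attempt
            return None  # every attempt failed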

5. Usage example

async def main():
    urls = [f"https://httpbin.org/get?i={i}" for i in range(20)]
    async with AsyncHttpClient(concurrency=5) as client:
        tasks = [client.get(url) for url in urls]
        results = await asyncio.gather(*tasks)
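        # Compare against None so that an empty response body still counts as a success.
        print(f"Got {sum(1 for r in results if r is not None)} successful responses")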

asyncio.run(main())

IV. Building an async HTTP client with httpx

1. Basic usage

import httpx
import asyncio

async def fetch(client, url):
    resp = await client.get(url, timeout=10)
    return resp.text

async def main():
    urls = [f"https://httpbin.org/get?i={i}" for i in range(10)]
    async with httpx.AsyncClient() as client:
        tasks = [fetch(client, url) for url in urls]
        results = await asyncio.gather(*tasks)
        print(results)

asyncio.run(main())

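2. Advantages of httpx

An API close to requests, which eases migration;
Support for HTTP/2 (opt-in via the httpx[http2] extra and http2=True), connection pooling, proxies, authentication, and other advanced features;
Both synchronous and asynchronous modes;
A good fit for building SDKs or microservice clients.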

3. Wrapping it in a client class

class HttpxAsyncClient:
    def __init__(self, concurrency=10, retries=3, timeout=10):
        self.semaphore = asyncio.Semaphore(concurrency)
        self.retries = retries
        self.timeout = timeout
        self.client = None

    async def __aenter__(self):
        self.client = httpx.AsyncClient(timeout=self.timeout)
        return self

    async def __aexit__(self, *args):
        await self.client.aclose()

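    async def get(self, url):
        async with self.semaphore:
            for attempt in range(self.retries):
                try:
                    resp = await self.client.get(url)
                    return resp.text
                # httpx.HTTPError is the base class for transport errors and timeouts.
                except httpx.HTTPError as e:
                    print(f"[{attempt + 1}] request failed: {e}")
                    if attempt < self.retries - 1:
                        await asyncio.sleep(1)  # brief pause before the next attempt
            return None  # every attempt failed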

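V. Performance comparison: aiohttp vs httpx

We tested both libraries with 100 concurrent requests against httpbin.org:

Library	Average time (s)	Success rate	Notes
aiohttp	1.8	100%	Stable, mature, widely used
httpx	2.1	100%	More modern API, well suited to SDKs

📌 Conclusion: aiohttp was slightly faster in this test, while httpx offers the more modern API. Against a public endpoint such as httpbin.org, a 0.3 s gap can fall within normal network variance, so choose based on project needs rather than on this number alone.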
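
The article does not show the harness behind these numbers. For readers who want to run their own comparison, the following is one possible sketch that works with any async fetch(url) callable. The names time_batch, benchmark, and fake_fetch are assumptions, and fake_fetch merely simulates latency so the code runs offline.

```python
import asyncio
import statistics
import time


async def time_batch(fetch, urls):
    """Fire all requests concurrently; return (elapsed seconds, success rate)."""
    start = time.perf_counter()
    results = await asyncio.gather(*(fetch(u) for u in urls), return_exceptions=True)
    elapsed = time.perf_counter() - start
    # A result counts as a success if it is neither an exception nor None.
    succeeded = sum(
        1 for r in results if r is not None and not isinstance(r, BaseException)
    )
    return elapsed, succeeded / len(urls)


async def benchmark(fetch, urls, rounds=5):
    """Repeat the batch several times so one slow round does not decide the result."""
    timings, rates = [], []
    for _ in range(rounds):
        elapsed, rate = await time_batch(fetch, urls)
        timings.append(elapsed)
        rates.append(rate)
    return {
        "mean_s": statistics.mean(timings),
        "stdev_s": statistics.stdev(timings) if rounds > 1 else 0.0,
        "success_rate": statistics.mean(rates),
    }


async def fake_fetch(url):  # hypothetical stand-in; swap in client.get for a real run
    await asyncio.sleep(0.01)
    return f"body of {url}"
```

Reporting the standard deviation alongside the mean helps show whether a gap between two libraries is larger than the run-to-run noise.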

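VI. Best practices and engineering recommendations

Scenario	Recommended approach
High-concurrency crawler	aiohttp + throttling
Building an API SDK	httpx (one interface for sync and async)
Microservice communication	httpx with HTTP/2 support
Proxies/authentication needed	Both support them; httpx is more elegant
Connection pool reuse needed	Both support it by default; configure timeout and keepalive sensibly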
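
As a configuration sketch for the last row, the snippet below shows where pool size, keepalive, and timeout settings live in each library. The factory function names and every numeric value are illustrative assumptions, not tuned recommendations. Both libraries must be installed for it to run.

```python
import aiohttp
import httpx


def make_aiohttp_session() -> aiohttp.ClientSession:
    """Call this from inside a running event loop, e.g. within an async function."""
    connector = aiohttp.TCPConnector(
        limit=100,             # total simultaneous connections in the pool
        limit_per_host=10,     # simultaneous connections to any single host
        keepalive_timeout=30,  # seconds an idle connection stays open for reuse
    )
    timeout = aiohttp.ClientTimeout(total=30, connect=5)
    return aiohttp.ClientSession(connector=connector, timeout=timeout)


def make_httpx_client() -> httpx.AsyncClient:
    limits = httpx.Limits(
        max_connections=100,           # total simultaneous connections in the pool
        max_keepalive_connections=20,  # idle connections kept open for reuse
        keepalive_expiry=30,           # seconds before an idle connection is closed
    )
    timeout = httpx.Timeout(30.0, connect=5.0)  # 30 s default, 5 s to connect
    return httpx.AsyncClient(limits=limits, timeout=timeout)
```

Either object can then be used with async with, in place of the default-configured session or client created in the wrapper classes.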