Course materials: "milvus-course-ai.zip"
Link: https://pan.quark.cn/s/00f3d411bb6d
GitHub: https://github.com/yuanmomoya/milvus
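Learning Objectives
After completing this chapter, you should be able to:
- Understand the Milvus write path and how Segments are created.
- Choose an appropriate batch_size and concurrency level.
- Avoid the Segment fragmentation caused by frequent flushes.
- Implement a production-grade write flow with retries, progress tracking, and error handling.
- Evaluate and optimize write throughput.
Write Path Review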
Sequence diagram (participants: Application, Milvus Proxy, WAL/MQ, DataNode, MinIO/S3):
- Application → Milvus Proxy: upsert(batch)
- Milvus Proxy: parameter validation, shard routing
- Milvus Proxy → WAL/MQ: write message
- WAL/MQ → Milvus Proxy: ack
- Milvus Proxy → Application: return success
- DataNode (asynchronous consumption): consume the write message from WAL/MQ and accumulate it into a growing segment
- DataNode: threshold reached → seal segment
- DataNode → MinIO/S3: flush binlog to object storage
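Key takeaways:
- A successful upsert return means the data has entered the WAL; it does not mean the index has been built.
- Data in a growing segment is searchable (by brute-force scan), but more slowly than indexed data.
- After a Segment is sealed, an IndexNode builds its index asynchronously.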
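Choosing batch_size
Influencing Factors
| Factor | batch_size too small | batch_size too large |
|---|---|---|
| Network overhead | Every batch pays the RPC overhead, so total time is long | Each request transfers a large payload |
| Memory | No problem | High peak memory on the client and the Proxy |
| Timeout risk | None | Processing a single batch may time out |
| Segment efficiency | May produce more small Segments | Large Segments form more easily |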
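Recommended Values
| Vector dimension | Recommended batch_size | Estimated data per batch |
|---|---|---|
| 128 | 2000-5000 | 1-2.5 MB |
| 512 | 1000-3000 | 2-6 MB |
| 768 | 500-2000 | 1.5-6 MB |
| 1536 | 500-1000 | 3-6 MB |
Rule of thumb: keep each batch at roughly 2-8 MB and adjust batch_size to the dimension. The estimates in the table count only the float32 vector data (4 bytes per dimension).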
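The size column follows from simple arithmetic, which the sketch below makes explicit. It assumes that only the float32 vector field contributes to the payload (scalar fields and serialization overhead are ignored); `batch_bytes` and `batch_size_for_target` are hypothetical helper names, not part of pymilvus.

```python
def batch_bytes(dim: int, batch_size: int, bytes_per_value: int = 4) -> int:
    """Approximate payload of one batch, counting only float32 vector data."""
    return dim * bytes_per_value * batch_size


def batch_size_for_target(dim: int, target_mb: float, bytes_per_value: int = 4) -> int:
    """Largest batch_size whose vector payload fits in target_mb (1 MB = 1024 * 1024 bytes)."""
    bytes_per_row = dim * bytes_per_value
    return max(1, int(target_mb * 1024 * 1024) // bytes_per_row)


if __name__ == "__main__":
    # Print a batch_size for a 4 MB target at each dimension from the table.
    for dim in (128, 512, 768, 1536):
        print(f"dim={dim:5d}  batch_size={batch_size_for_target(dim, target_mb=4)}")
```

Treat the result as a starting point for the benchmark rather than a final value, since real rows usually carry more than the vector.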
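Testing Different batch_size Values
python
import time
import numpy as np
from pymilvus import MilvusClient

# Assumes the "bench_write" collection already exists, with a VARCHAR primary
# key "id" and a 768-dimensional float vector field "embedding".
client = MilvusClient(uri="http://localhost:19530")
DIM = 768
TOTAL = 50_000

def benchmark_batch_size(batch_size: int) -> float:
    """Measure write throughput for the given batch_size."""
    elapsed = 0.0
    for i in range(0, TOTAL, batch_size):
        size = min(batch_size, TOTAL - i)
        # Generate unit-length random vectors (excluded from the timing).
        vectors = np.random.randn(size, DIM).astype("float32")
        norms = np.linalg.norm(vectors, axis=1, keepdims=True)
        vectors = (vectors / norms).tolist()
        data = [{"id": str(i + j), "embedding": vectors[j]} for j in range(size)]
        # Time only the upsert call so data generation does not skew the result.
        start = time.perf_counter()
        client.upsert(collection_name="bench_write", data=data)
        elapsed += time.perf_counter() - start
    throughput = TOTAL / elapsed
    print(f"batch_size={batch_size:5d} elapsed={elapsed:.1f}s throughput={throughput:.0f} rows/s")
    return throughput

# Compare. Every run reuses the same ids, so runs after the first overwrite existing rows.
for bs in [100, 500, 1000, 2000, 5000]:
    benchmark_batch_size(bs)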
Concurrent Writes
Multi-threaded Writes
python
import concurrent.futures
import threading
from pymilvus import MilvusClient

def parallel_upsert(
    uri: str,
    collection_name: str,
    data: list[dict],
    batch_size: int = 1000,
    max_workers: int = 4,
) -> int:
    """Upsert data concurrently from multiple threads; return the upserted row count."""
    # Each thread lazily creates and then reuses its own client instance.
    local = threading.local()

    def get_client() -> MilvusClient:
        if not hasattr(local, "client"):
            local.client = MilvusClient(uri=uri)
        return local.client

    def write_batch(batch: list[dict]) -> int:
        client = get_client()
        result = client.upsert(collection_name=collection_name, data=batch)
        return result["upsert_count"]

    # Split the data into batches.
    batches = [data[i:i + batch_size] for i in range(0, len(data), batch_size)]
    total = 0
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(write_batch, batch) for batch in batches]
        for future in concurrent.futures.as_completed(futures):
            # result() re-raises any exception raised inside write_batch.
            total += future.result()
    return total
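Concurrency Recommendations
| Scenario | Recommended concurrency | Notes |
|---|---|---|
| Standalone, local | 2-4 | Single-node resources are limited |
| Standalone, production | 4-8 | Depends on CPU and network |
| Cluster mode | 8-16 | Multiple DataNodes can consume in parallel |
Excessive concurrency causes queueing at the Proxy and higher memory pressure, which lowers throughput instead of raising it.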
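Because the best concurrency level depends on the environment, it is worth measuring rather than guessing. The sketch below sweeps several worker counts over the same batches and reports rows per second. It is a minimal harness under stated assumptions: `write_batch` is any callable that writes one batch and returns the number of rows written (for example, a wrapper around `client.upsert`), and the `fake_write` function in the demo is a stand-in that only sleeps.

```python
import concurrent.futures
import time
from typing import Callable, Sequence


def measure_throughput(
    write_batch: Callable[[list], int],
    batches: Sequence[list],
    max_workers: int,
) -> float:
    """Run write_batch over all batches with max_workers threads; return rows/s."""
    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        written = sum(executor.map(write_batch, batches))
    elapsed = time.perf_counter() - start
    return written / elapsed if elapsed > 0 else 0.0


def sweep_workers(
    write_batch: Callable[[list], int],
    batches: Sequence[list],
    worker_counts: Sequence[int] = (1, 2, 4, 8),
) -> dict[int, float]:
    """Measure throughput at each concurrency level in worker_counts."""
    return {n: measure_throughput(write_batch, batches, n) for n in worker_counts}


if __name__ == "__main__":
    def fake_write(batch: list) -> int:
        time.sleep(0.01)  # stand-in for the round trip of a real upsert
        return len(batch)

    demo_batches = [[0] * 100 for _ in range(40)]
    for workers, rate in sweep_workers(fake_write, demo_batches).items():
        print(f"workers={workers} throughput={rate:.0f} rows/s")
```

Against a real deployment, throughput typically stops improving at some worker count; that point is the one to pick.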
Avoiding Segment Fragmentation
Problem: Frequent flush
Flowchart: flushing after every 100 rows → a large number of small Segments is produced → searches have to scan more Segments → search latency increases. The small Segments also increase Compaction pressure.
Correct Approach
python
# Wrong: flushing after every batch
for batch in batches:
    client.upsert(collection_name="docs", data=batch)
    client.flush(collection_name="docs")  # Don't do this!

# Right: let Milvus manage flushing automatically
for batch in batches:
    client.upsert(collection_name="docs", data=batch)
# After all writes finish, flush once if all data must be searchable immediately
# client.flush(collection_name="docs")  # only when necessary
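How Milvus Flushes Automatically
Milvus automatically seals a segment when any of the following holds:
- The growing segment reaches dataCoord.segment.maxSize (512 MB in the configuration below; the default differs between Milvus versions).
- The growing segment reaches the sealProportion fraction of that size.
- No data has been written to it for a certain period of time.
Configuration reference (milvus.yaml):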
yaml
dataCoord:
  segment:
    maxSize: 512          # maximum Segment size: 512 MB
    sealProportion: 0.12  # seal once 12% is reached
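To put these settings in perspective, the arithmetic below compares the number of Segments produced by size-based sealing with the number produced by manual flushing. It is a rough sketch under loud assumptions: the seal threshold is taken to be maxSize × sealProportion, only float32 vector bytes are counted, and a single shard is assumed. Real Segment sizes also depend on other fields, the shard count, and version-specific sealing policies. Both function names are hypothetical.

```python
import math


def rows_at_seal_threshold(
    dim: int,
    max_size_mb: int = 512,
    seal_proportion: float = 0.12,
    bytes_per_value: int = 4,
) -> int:
    """Rows that fit below maxSize * sealProportion, counting vector bytes only."""
    threshold_bytes = max_size_mb * 1024 * 1024 * seal_proportion
    return int(threshold_bytes // (dim * bytes_per_value))


def segments_produced(total_rows: int, rows_per_segment: int) -> int:
    """Number of Segments if one is sealed every rows_per_segment rows."""
    return math.ceil(total_rows / rows_per_segment)


if __name__ == "__main__":
    dim, total = 768, 1_000_000
    auto_rows = rows_at_seal_threshold(dim)
    print(f"rows per Segment at the seal threshold: {auto_rows}")
    print(f"Segments with size-based sealing: {segments_produced(total, auto_rows)}")
    print(f"Segments when flushing every 100 rows: {segments_produced(total, 100)}")
```

Under these assumptions, flushing every 100 rows yields orders of magnitude more Segments than size-based sealing for the same data.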
Production-Grade Write Flow
python
import logging
import time
from dataclasses import dataclass
from typing import Any

from pymilvus import MilvusClient
from pymilvus.exceptions import MilvusException, MilvusUnavailableException

logger = logging.getLogger(__name__)

@dataclass
class WriteResult:
    total_written: int
    total_failed: int
    elapsed_seconds: float
    throughput: float  # rows/s

def production_bulk_write(
    client: MilvusClient,
    collection_name: str,
    data: list[dict[str, Any]],
    batch_size: int = 1000,
    max_retries: int = 3,
    retry_delay: float = 2.0,
    progress_interval: int = 10,
) -> WriteResult:
    """Production-grade bulk write: batching, retries, progress logging, and failure counts."""
    start_time = time.perf_counter()
    total_written = 0
    total_failed = 0
    total_batches = (len(data) + batch_size - 1) // batch_size
    for batch_idx in range(0, len(data), batch_size):
        batch = data[batch_idx : batch_idx + batch_size]
        batch_num = batch_idx // batch_size + 1
        success = False
        for attempt in range(max_retries):
            try:
                result = client.upsert(collection_name=collection_name, data=batch)
                total_written += result["upsert_count"]
                success = True
                break
            except MilvusUnavailableException as e:
                # Transient unavailability: retry, unless this was the last attempt.
                if attempt == max_retries - 1:
                    logger.warning(
                        "Batch %d/%d write failed (attempt %d): %s, no retries left",
                        batch_num, total_batches, attempt + 1, e,
                    )
                    break
                delay = retry_delay * (2 ** attempt)  # exponential backoff
                logger.warning(
                    "Batch %d/%d write failed (attempt %d): %s, retrying in %.1fs",
                    batch_num, total_batches, attempt + 1, e, delay,
                )
                time.sleep(delay)
            except MilvusException as e:
                # Any other Milvus error is treated as non-retryable.
                logger.error("Batch %d/%d non-retryable error: %s", batch_num, total_batches, e)
                break
        if not success:
            total_failed += len(batch)
            logger.error("Batch %d/%d failed permanently, skipping %d rows", batch_num, total_batches, len(batch))
        # Progress log
        if batch_num % progress_interval == 0:
            elapsed = time.perf_counter() - start_time
            speed = total_written / elapsed if elapsed > 0 else 0
            logger.info(
                "Progress: %d/%d batches, %d rows written, %d rows failed, %.0f rows/s",
                batch_num, total_batches, total_written, total_failed, speed,
            )
    elapsed = time.perf_counter() - start_time
    throughput = total_written / elapsed if elapsed > 0 else 0
    logger.info(
        "Write finished: %d rows succeeded, %d rows failed, took %.1fs, throughput %.0f rows/s",
        total_written, total_failed, elapsed, throughput,
    )
    return WriteResult(
        total_written=total_written,
        total_failed=total_failed,
        elapsed_seconds=elapsed,
        throughput=throughput,
    )
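The retry loop above waits retry_delay × 2^attempt between attempts. When many writers retry at the same time, a cap and some random jitter are commonly added to that schedule so clients do not retry in lockstep. The sketch below shows one such variation; `backoff_delays`, `max_delay`, and `jitter` are hypothetical names introduced for illustration and are not part of the function above or of pymilvus.

```python
import random
from typing import Optional


def backoff_delays(
    max_retries: int,
    base_delay: float = 2.0,
    max_delay: float = 30.0,
    jitter: float = 0.0,
    rng: Optional[random.Random] = None,
) -> list[float]:
    """Delay before each retry: base_delay * 2**attempt, capped at max_delay.

    jitter adds a random extra wait of up to that fraction of the capped delay.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        delay = min(base_delay * (2 ** attempt), max_delay)
        delay += delay * jitter * rng.random()
        delays.append(delay)
    return delays


if __name__ == "__main__":
    print("no jitter:", backoff_delays(6))
    print("with jitter:", [round(d, 2) for d in backoff_delays(6, jitter=0.5)])
```

With jitter left at 0 and no cap reached, this reproduces the schedule used in the write loop.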
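insert vs upsert Performance
| Operation | Behavior | Performance | Use case |
|---|---|---|---|
| insert | Pure insert; Milvus does not deduplicate primary keys on insert, so re-inserting an existing key leaves duplicate rows | Slightly faster | Data is known to contain no duplicates |
| upsert | Updates the row if the primary key exists, inserts it otherwise | Slightly slower (existing keys must be handled) | Incremental sync, idempotent writes |
The performance difference is usually under 10%, but measure it on your own workload. If you need idempotency (re-running the job does not create duplicate data), prefer upsert.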
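What idempotency means for a re-run job can be shown with a toy in-memory model. This is only an illustration of the semantics, not of how Milvus stores data: `ToyCollection` is a hypothetical class that models upsert as "remove any existing row with the same primary key, then add the new row".

```python
class ToyCollection:
    """In-memory stand-in used only to illustrate upsert semantics."""

    def __init__(self) -> None:
        self.rows: dict[str, dict] = {}  # primary key -> row

    def upsert(self, data: list[dict]) -> int:
        """Overwrite rows whose "id" already exists, add the rest; return rows processed."""
        for row in data:
            self.rows[row["id"]] = dict(row)
        return len(data)


if __name__ == "__main__":
    coll = ToyCollection()
    job = [{"id": "doc-1", "text": "a"}, {"id": "doc-2", "text": "b"}]
    coll.upsert(job)
    coll.upsert(job)  # re-running the same job leaves the state unchanged
    print(len(coll.rows))
```

Because the second run overwrites rather than appends, a job that crashes halfway can simply be restarted from the beginning.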
Large-Scale Data Import
Scenario: Initial Import of Millions of Rows
python
import numpy as np
from pymilvus import MilvusClient

def generate_and_write(
    client: MilvusClient,
    collection_name: str,
    total: int,
    dim: int,
    batch_size: int = 2000,
) -> None:
    """Generate data at scale and write it batch by batch."""
    written = 0
    next_report = 50_000
    for i in range(0, total, batch_size):
        size = min(batch_size, total - i)
        # Generate vectors (in practice they come from files or a model).
        vectors = np.random.randn(size, dim).astype("float32")
        norms = np.linalg.norm(vectors, axis=1, keepdims=True)
        vectors = vectors / norms
        data = [
            {"id": f"doc-{i+j:08d}", "embedding": vectors[j].tolist()}
            for j in range(size)
        ]
        client.upsert(collection_name=collection_name, data=data)
        written += size
        # Report at each 50,000-row mark, even if batch_size does not divide it.
        if written >= next_report:
            print(f"Written: {written:,} / {total:,}")
            next_report += 50_000
    print(f"Import finished: {written:,} rows")
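Post-Import Optimization
After a large bulk import, the following steps are recommended:
python
# 1. Wait for index building to finish
import time
deadline = time.monotonic() + 600  # arbitrary 10-minute bound so the loop cannot spin forever
while time.monotonic() < deadline:
    info = client.describe_collection(collection_name)
    # Check index status... and break once the build has finished
    time.sleep(5)
# 2. Run one compaction to merge small Segments
# Milvus compacts automatically, but after a large import you can trigger it manually
# (via the REST API, or wait for the automatic trigger)
# 3. Verify the row count
stats = client.get_collection_stats(collection_name)
print(f"Total rows: {stats['row_count']}")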
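The index-status check in step 1 is left open above. One possible way to fill it in is sketched below, under the assumption that your pymilvus version's `MilvusClient.describe_index` returns a dict containing a `pending_index_rows` count; verify this against the version you run. The function name `wait_for_index`, the `index_name` argument, and the timeout values are illustrative choices.

```python
import time


def wait_for_index(
    client,
    collection_name: str,
    index_name: str,
    timeout_s: float = 600.0,
    poll_s: float = 5.0,
) -> bool:
    """Poll until no rows are pending indexing; return False if timeout_s elapses first."""
    deadline = time.monotonic() + timeout_s
    while True:
        info = client.describe_index(collection_name=collection_name, index_name=index_name)
        # Assumption: the returned dict exposes "pending_index_rows".
        # A KeyError here means this version reports index progress differently.
        if info["pending_index_rows"] == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

A call might look like `wait_for_index(client, "docs", "embedding")`, where the index name is whatever was used when the index was created.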
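Write Performance Tuning Checklist
| Optimization | Method | Expected effect |
|---|---|---|
| batch_size | Tune to 1000-2000 | Less RPC overhead |
| Concurrency | 2-4 threads | 30-100% higher throughput |
| Avoid flush | Do not flush manually | No Segment fragmentation |
| Fewer fields | Write only the fields you need | Less serialization overhead |
| Precomputed vectors | Encode offline in bulk | No waiting on model inference during writes |
| Network | Client in the same data center as Milvus | Lower network latency |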
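Common Errors
| Symptom | Cause | Fix |
|---|---|---|
| Write timeout | batch_size too large | Reduce it to below 1000 |
| Low throughput | Single thread + small batches | Increase batch_size + use multiple threads |
| Too many Segments | Frequent manual flush | Remove the flush calls |
| Data not found by search after writing | Data is still in a growing segment and its index has not been built | Wait for the automatic flush, or flush manually once |
| OOM | All data loaded into memory at once | Stream the input + write in batches |
| Duplicate primary keys | insert was given primary keys that already exist | Switch to upsert |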
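The "stream the input + write in batches" fix for OOM can be sketched with a generator that never holds more than one batch in memory. This is a minimal sketch: `iter_batches` and `read_jsonl` are hypothetical helpers, and the JSON Lines input format (one row object per line) is an assumption made for the example.

```python
import json
from itertools import islice
from typing import Iterable, Iterator


def iter_batches(rows: Iterable[dict], batch_size: int) -> Iterator[list[dict]]:
    """Yield lists of up to batch_size rows, consuming the input lazily."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def read_jsonl(path: str) -> Iterator[dict]:
    """Yield one row per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)
```

A write loop then becomes `for batch in iter_batches(read_jsonl("vectors.jsonl"), 1000): client.upsert(collection_name="docs", data=batch)`, where the file name and collection name are placeholders.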
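Interview Questions
- Why is flushing after every batch discouraged?
  Each flush seals the current growing segment and produces a small Segment. Many small Segments mean a search has to scan more files, which raises latency. Milvus's automatic flush mechanism seals a Segment once it reaches a suitable size.
- Is data guaranteed to be searchable once upsert returns success?
  It is searchable, but possibly only by brute-force scan (a growing segment has no index yet). For index-level search performance, wait until the Segment is sealed and its index has been built.
- Why should each thread use its own client for concurrent writes?
  MilvusClient maintains connection state internally. Sharing one client across threads may lead to contention on the connection, and one client per thread keeps the connections isolated. Whether sharing is actually unsafe depends on the pymilvus version; per-thread clients are the conservative choice.
- Why can search become slower after a large import?
  A large write can leave many unmerged Segments. Performance recovers once Compaction completes. Increasing segment.maxSize also reduces the number of Segments.
- How do you estimate the theoretical upper bound of write throughput?
  It is bounded by network bandwidth (batch size × QPS), Proxy CPU (serialization), WAL write speed, and DataNode consumption speed. Standalone mode typically reaches 10-50K rows/s; cluster mode can scale roughly linearly.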
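The network term in the last answer lends itself to a back-of-the-envelope calculation: link bandwidth divided by bytes per row. The sketch below assumes float32 vectors dominate the row size and ignores protocol overhead; the function names and the per-stage figures in the demo are hypothetical, chosen only to show that the slowest stage sets the overall bound.

```python
def network_bound_rows_per_sec(
    bandwidth_mbit_s: float,
    dim: int,
    extra_bytes_per_row: int = 0,
) -> float:
    """Upper bound on rows/s imposed by the link, counting 4 bytes per dimension."""
    bytes_per_row = dim * 4 + extra_bytes_per_row
    return bandwidth_mbit_s * 1_000_000 / 8 / bytes_per_row


def bottleneck(stage_limits: dict[str, float]) -> tuple[str, float]:
    """Return the stage with the lowest rows/s limit; it caps end-to-end throughput."""
    name = min(stage_limits, key=stage_limits.get)
    return name, stage_limits[name]


if __name__ == "__main__":
    limits = {
        "network": network_bound_rows_per_sec(1000, dim=768),  # 1 Gbit/s link
        "proxy": 30_000.0,     # made-up figure for illustration
        "datanode": 50_000.0,  # made-up figure for illustration
    }
    stage, rate = bottleneck(limits)
    print(f"bottleneck={stage} limit={rate:.0f} rows/s")
```

Only the network limit here is derived from first principles; the other stage limits have to be measured.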
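Exercises
- batch_size tuning: prepare 100,000 rows of 768-dimensional data and write them with batch_size=100, 500, 1000, 2000, and 5000. Record the total time and throughput for each, and plot the batch_size-throughput curve.
- Concurrency experiment: with batch_size fixed at 1000, write 100,000 rows using 1, 2, 4, and 8 threads and compare throughput. Find the best concurrency level for your environment.
- Flush impact: write 50,000 rows in two runs, one flushing after every 1000 rows and one without any flush. Compare search latency and Segment count after the writes.
- Resumable writes: simulate Milvus restarting in the middle of a write. Design a primary-key-based resume mechanism that ensures no rows are written twice after the restart.
Summary
The core of bulk write optimization is a suitable batch_size (1000-2000), moderate concurrency (2-4 threads), and no manual flush. Production code must have retries and progress tracking. The write throughput bottleneck is usually the network and the Proxy, not storage.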