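CSDN Technical Tutorial Series: Text and Vector Retrieval in Practice (.NET C#)
Series topic: From In-Memory to Elasticsearch: The Evolution of Text and Vector Retrieval in the .NET C# Ecosystem, with Worked Examples
Target readers: intermediate to senior .NET backend engineers, AI application developers, and technical architects
Tech stack: .NET 8/9, C# 12, ONNX Runtime, BGE-M3, CLIP, Elasticsearch, Python Flask
📚 Series Plan (5 Articles)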
| No. | Article Title | Core Technologies |
|---|---|---|
| 1 | BGE-M3 Multilingual Embedding Model in Practice: .NET C# from Principles to Production | BGE-M3, ONNX Runtime, Tokenizer |
| 2 | Designing and Implementing an In-Memory Vector Search Engine: A Lightweight Milvus Alternative in C# | In-memory computing, read-write locks, parallel retrieval |
| 3 | Elasticsearch Semantic Search in Practice: Hybrid Vector + Keyword Retrieval in .NET | ES 8.x, Dense Vector, Hybrid Search |
| 4 | CLIP Multimodal Search in Practice: Cross-Language Image Retrieval with .NET + Python | OpenCLIP, Python Flask, cross-modal |
| 5 | From In-Memory to ES: The Evolution of an Enterprise Vector Retrieval Architecture in .NET | Architecture design, performance tuning, disaster recovery |
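Article 4: CLIP Multimodal Search in Practice: Cross-Language Image Retrieval with .NET + Python
📝 Article Info
- Categories: Computer Vision / Multimodal AI / Cross-Modal Retrieval / .NET
- Tags: CLIP, OpenCLIP, multimodal, cross-modal retrieval, .NET, Python, Flask
- Suggested cover: CLIP architecture diagram + .NET-Python bridge diagram + image search scenario
- Why Python? Because exporting CLIP to ONNX kept failing for me. If you have a working approach, please share it. I would like to go all-in on .NET too!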
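📖 Chapter Outline
1. Introduction: The Multimodal Search Revolution
- Limitations of traditional search
- CLIP's breakthrough
- Application scenarios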
2. CLIP Principles in Depth
┌─────────────────┐                 ┌─────────────────┐
│   Text Input    │                 │   Image Input   │
│  "pink dress"   │                 │ [image pixels]  │
└────────┬────────┘                 └────────┬────────┘
         │                                   │
         ▼                                   ▼
┌─────────────────┐                 ┌─────────────────┐
│  Text Encoder   │                 │  Image Encoder  │
│  (Transformer)  │                 │      (ViT)      │
└────────┬────────┘                 └────────┬────────┘
         │                                   │
         ▼                                   ▼
    ┌─────────┐                         ┌─────────┐
    │ 512-dim │◄───────────────────────►│ 512-dim │
    │ Vector  │       similarity        │ Vector  │
    └─────────┘                         └─────────┘
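The diagram ends with a similarity computation between the text vector and the image vector. As a minimal sketch, assuming cosine similarity is the metric (the service in section 3 L2-normalizes its embeddings, so a dot product would give the same value), a text query can be ranked against images as follows. The toy 3-dim vectors and the helper names `l2_normalize`, `cosine_similarity`, and `rank_images` are illustrative only and are not part of CLIP or OpenCLIP.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def rank_images(text_vec, image_vecs):
    """Return (image_id, score) pairs sorted by descending similarity."""
    scored = [(image_id, cosine_similarity(text_vec, vec))
              for image_id, vec in image_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dim vectors standing in for real 512-dim CLIP embeddings.
text_vec = [1.0, 0.0, 0.0]
image_vecs = {
    "dress.jpg": [0.9, 0.1, 0.0],
    "car.jpg": [0.0, 1.0, 0.0],
}
ranking = rank_images(text_vec, image_vecs)
```

With these toy inputs the image whose vector points in nearly the same direction as the text vector ranks first, which is the behavior cross-modal retrieval relies on.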
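3. Building the Python CLIP Service (Flask)
import base64
import io

import open_clip
import torch
from flask import Flask, request, jsonify
from PIL import Image
from safetensors.torch import load_file  # used by load_model() to read the weights

app = Flask(__name__)

# Lazily initialized singletons so the model is loaded only once per process.
_model = None
_tokenizer = None
_preprocess = None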
def load_model():
    """Load the OpenCLIP model, tokenizer, and image preprocessor once, then cache them."""
    global _model, _tokenizer, _preprocess
    if _model is not None:
        return _model, _tokenizer, _preprocess

    model_name = "ViT-B-32"
    # Build the architecture without pretrained weights, then load local weights.
    model, _, preprocess = open_clip.create_model_and_transforms(
        model_name, pretrained=None, device="cpu"
    )
    # load_file comes from safetensors.torch.
    weights = load_file('/models/open_clip_model.safetensors')
    model.load_state_dict(weights)
    model.eval()

    _model = model
    _tokenizer = open_clip.get_tokenizer(model_name)
    _preprocess = preprocess
    return _model, _tokenizer, _preprocess
@app.route('/encode_text', methods=['POST'])
def encode_text():
    """Encode one text string into an L2-normalized CLIP embedding."""
    model, tokenizer, _ = load_model()
    data = request.json or {}
    text = data.get('text', '')

    tokens = tokenizer([text])
    with torch.no_grad():
        text_features = model.encode_text(tokens)
        # Normalize to unit length so a dot product equals cosine similarity.
        text_features = text_features / text_features.norm(dim=-1, keepdim=True)

    embedding = text_features[0].cpu().numpy().tolist()
    return jsonify({
        'text': text,
        'embedding': embedding,
        'dimension': len(embedding)
    })
@app.route('/encode_image', methods=['POST'])
def encode_image():
    """Encode a Base64 image into an L2-normalized CLIP embedding."""
    model, _, preprocess = load_model()
    data = request.json or {}
    image_base64 = data.get('image_base64', '')
    if not image_base64:
        return jsonify({'error': 'image_base64 is required'}), 400

    # Drop a data-URL prefix such as "data:image/png;base64," if present.
    if ',' in image_base64:
        image_base64 = image_base64.split(',')[1]
    image_data = base64.b64decode(image_base64)
    image = Image.open(io.BytesIO(image_data))
    if image.mode != 'RGB':
        image = image.convert('RGB')

    # preprocess returns a CHW tensor; unsqueeze adds the batch dimension.
    image_tensor = preprocess(image).unsqueeze(0)
    with torch.no_grad():
        image_features = model.encode_image(image_tensor)
        image_features = image_features / image_features.norm(dim=-1, keepdim=True)

    embedding = image_features[0].cpu().numpy().tolist()
    return jsonify({
        'embedding': embedding,
        'dimension': len(embedding)
    })


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
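The C# client in section 4 posts `{"texts": [...]}` to `/encode_text_batch` and reads `embeddings` and `count` from the reply, but the service above only defines `/encode_text` and `/encode_image`. One possible way to close that gap is sketched below. It is deliberately framework-free so the request and response shapes are easy to check. `build_batch_response` and `fake_encode` are hypothetical names; in the real service the encoder argument would presumably wrap `tokenizer(texts)` and `model.encode_text` with the same normalization used in `encode_text`.

```python
def build_batch_response(payload, encode_batch):
    """Validate a {"texts": [...]} payload and shape the JSON reply.

    encode_batch is any callable that maps a list of strings to a list of
    embedding vectors (lists of floats), one per input string.
    Returns a (body, http_status) tuple.
    """
    texts = (payload or {}).get("texts")
    if not isinstance(texts, list) or not texts:
        return {"error": "'texts' must be a non-empty list"}, 400
    if not all(isinstance(t, str) for t in texts):
        return {"error": "every item in 'texts' must be a string"}, 400

    embeddings = encode_batch(texts)
    if len(embeddings) != len(texts):
        return {"error": "encoder returned the wrong number of vectors"}, 500

    # Keys match what the C# BatchEncodeResponse class deserializes.
    return {"embeddings": embeddings, "count": len(embeddings)}, 200


def fake_encode(texts):
    """Stand-in encoder for illustration: one 2-dim vector per text."""
    return [[float(len(t)), 1.0] for t in texts]


body, status = build_batch_response({"texts": ["pink dress", "car"]}, fake_encode)
```

Inside Flask, a route registered at `/encode_text_batch` could call this helper with `request.json` and return `jsonify(body), status`.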
4. .NET C# Client Integration
4.1 CLIP Service Client (C#)
using System.Text;
using System.Text.Json;

namespace VectorSearch.Clients
{
    /// <summary>
    /// Client for the CLIP embedding service (C# implementation).
    /// </summary>
    public class ClipVectorClient : IDisposable
    {
        private readonly HttpClient _httpClient;
        private readonly string _serviceUrl;
        private readonly JsonSerializerOptions _jsonOptions;

        public ClipVectorClient(string serviceUrl = "http://localhost:5000")
        {
            _serviceUrl = serviceUrl.TrimEnd('/');
            _httpClient = new HttpClient
            {
                Timeout = TimeSpan.FromSeconds(60)
            };
            _jsonOptions = new JsonSerializerOptions
            {
                // The service returns lowercase keys such as "embedding" and "dimension".
                PropertyNameCaseInsensitive = true
            };
        }

        /// <summary>
        /// Encodes a batch of texts.
        /// Note: the Flask listing in section 3 does not define /encode_text_batch,
        /// so the service must expose that route for this call to succeed.
        /// </summary>
        public async Task<float[][]> EncodeTextBatchAsync(string[] texts)
        {
            if (texts == null || texts.Length == 0)
                return Array.Empty<float[]>();

            var requestBody = new { texts = texts };
            var json = JsonSerializer.Serialize(requestBody);
            using var content = new StringContent(json, Encoding.UTF8, "application/json");

            using var response = await _httpClient.PostAsync($"{_serviceUrl}/encode_text_batch", content);
            response.EnsureSuccessStatusCode();

            var responseJson = await response.Content.ReadAsStringAsync();
            var result = JsonSerializer.Deserialize<BatchEncodeResponse>(responseJson, _jsonOptions);
            return result?.Embeddings ?? Array.Empty<float[]>();
        }
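
        /// <summary>
        /// Encodes an image downloaded from a URL.
        /// </summary>
        public async Task<float[]> EncodeImageFromUrlAsync(string imageUrl)
        {
            // Download the image with browser-like HTTP headers.
            using var request = new HttpRequestMessage(HttpMethod.Get, imageUrl);
            request.Headers.Add("User-Agent",
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36...");
            request.Headers.Add("Accept", "image/webp,image/jpeg,image/apng,image/*,*/*;q=0.8");
            request.Headers.Add("Referer", "https://example.com/");

            using var response = await _httpClient.SendAsync(request);
            response.EnsureSuccessStatusCode();

            var imageBytes = await response.Content.ReadAsByteArrayAsync();
            return await EncodeImageFromBytesAsync(imageBytes);
        }

        /// <summary>
        /// Encodes an image from raw bytes by converting them to Base64 first.
        /// </summary>
        public Task<float[]> EncodeImageFromBytesAsync(byte[] imageBytes)
        {
            if (imageBytes == null || imageBytes.Length == 0)
                return Task.FromResult(Array.Empty<float>());
            return EncodeImageFromBase64Async(Convert.ToBase64String(imageBytes));
        }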

        /// <summary>
        /// Encodes an image supplied as a Base64 string (a data-URL prefix is allowed).
        /// </summary>
        public async Task<float[]> EncodeImageFromBase64Async(string imageBase64)
        {
            if (string.IsNullOrWhiteSpace(imageBase64))
                return Array.Empty<float>();

            // Drop a data-URL prefix such as "data:image/png;base64," if present.
            if (imageBase64.Contains(','))
            {
                imageBase64 = imageBase64.Split(',')[1];
            }

            var requestBody = new { image_base64 = imageBase64 };
            var json = JsonSerializer.Serialize(requestBody);
            using var content = new StringContent(json, Encoding.UTF8, "application/json");

            using var response = await _httpClient.PostAsync($"{_serviceUrl}/encode_image", content);
            response.EnsureSuccessStatusCode();

            var responseJson = await response.Content.ReadAsStringAsync();
            var result = JsonSerializer.Deserialize<EncodeResponse>(responseJson, _jsonOptions);
            return result?.Embedding ?? Array.Empty<float>();
        }
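
        public void Dispose()
        {
            _httpClient?.Dispose();
        }
    }

    // Maps the JSON reply of /encode_image and /encode_text.
    public class EncodeResponse
    {
        public float[] Embedding { get; set; } = Array.Empty<float>();
        public int Dimension { get; set; }
    }

    // Maps the JSON reply expected from /encode_text_batch.
    public class BatchEncodeResponse
    {
        public float[][] Embeddings { get; set; } = Array.Empty<float[]>();
        public int Count { get; set; }
    }
}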
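Both the Flask service and the C# client drop a data-URL prefix by splitting on the first comma, which silently accepts malformed input. A slightly stricter variant for the service side is sketched below, assuming the payload is either raw Base64 or a `data:<mime>;base64,<data>` URL. `decode_image_payload` is a hypothetical helper name; it raises `ValueError` on bad input rather than handing invalid bytes to PIL.

```python
import base64
import binascii


def decode_image_payload(payload):
    """Return raw image bytes from raw Base64 or a Base64 data URL.

    Raises ValueError on empty or malformed input.
    """
    if not isinstance(payload, str) or not payload.strip():
        raise ValueError("image payload is empty")
    data = payload.strip()
    if data.startswith("data:"):
        # Split "data:image/png;base64" from the encoded body.
        header, sep, data = data.partition(",")
        if not sep or not header.endswith(";base64"):
            raise ValueError("unsupported data URL")
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        return base64.b64decode(data, validate=True)
    except binascii.Error as exc:
        raise ValueError("invalid Base64 data") from exc


raw = b"\x89PNG fake bytes"  # placeholder bytes, not a real image
encoded = base64.b64encode(raw).decode("ascii")
roundtrip_ok = decode_image_payload("data:image/png;base64," + encoded) == raw
```

A route handler could catch the `ValueError` and answer with HTTP 400, which gives the C# caller a clearer failure than a generic 500.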
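5. Multimodal Search API Design (C#)
[ApiController]
[Route("api/[controller]")]
public class MultimodalSearchController : ControllerBase
{
    private readonly ClipVectorClient _clipClient;
    private readonly IVectorStoreAsync _vectorStore;

    // Both dependencies are supplied through constructor injection.
    public MultimodalSearchController(ClipVectorClient clipClient, IVectorStoreAsync vectorStore)
    {
        _clipClient = clipClient;
        _vectorStore = vectorStore;
    }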

    [HttpPost("search")]
    public async Task<ActionResult<SearchResponse>> Search([FromBody] SearchRequest request)
    {
        float[] imageVector = null;
        float[] textVector = null;

        // 1. Build the image vector (a URL takes precedence over Base64)
        if (!string.IsNullOrWhiteSpace(request.ImageUrl))
        {
            imageVector = await _clipClient.EncodeImageFromUrlAsync(request.ImageUrl);
        }
        else if (!string.IsNullOrWhiteSpace(request.ImageBase64))
        {
            imageVector = await _clipClient.EncodeImageFromBase64Async(request.ImageBase64);
        }

        // 2. Build the text vector
        if (!string.IsNullOrWhiteSpace(request.TextQuery))
        {
            var vectors = await _clipClient.EncodeTextBatchAsync(new[] { request.TextQuery });
            textVector = vectors?.FirstOrDefault();
        }
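
        // 3. Reject requests that lack the vector their search mode needs
        if ((request.SearchMode == "image" && imageVector == null) ||
            (request.SearchMode == "text" && textVector == null) ||
            (request.SearchMode == "hybrid" && imageVector == null && textVector == null))
        {
            return BadRequest("The request does not contain the input required by the selected search mode.");
        }

        // 4. Run the search (HybridSearchAsync is not shown in this listing)
        var results = request.SearchMode switch
        {
            "image" => await _vectorStore.SearchByVectorAsync(imageVector, "image", request.TopK),
            "text" => await _vectorStore.SearchByVectorAsync(textVector, "text", request.TopK),
            "hybrid" => await HybridSearchAsync(imageVector, textVector, request.TopK),
            _ => new List<SearchResult>()
        };

        return Ok(new SearchResponse
        {
            Results = results,
            SearchMode = request.SearchMode
        });
    }
}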
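The controller calls `HybridSearchAsync`, which the listing does not show. One common way to combine two result sets is a weighted sum of per-item scores. The sketch below assumes each single-modality search returns a `{id: similarity}` mapping with string IDs, and that an item missing from one mapping scores 0 for that modality. The function name `fuse_scores`, the default weight of 0.5, and the sample IDs are illustrative choices, not the article's implementation.

```python
def fuse_scores(image_hits, text_hits, image_weight=0.5, top_k=10):
    """Merge two {id: score} dicts into a ranked list of (id, fused_score).

    fused = image_weight * image_score + (1 - image_weight) * text_score
    An item absent from one dict contributes 0 for that modality.
    """
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must be between 0 and 1")
    text_weight = 1.0 - image_weight
    fused = {}
    for item_id in set(image_hits) | set(text_hits):
        fused[item_id] = (image_weight * image_hits.get(item_id, 0.0)
                          + text_weight * text_hits.get(item_id, 0.0))
    # Sort by score descending, then by id so ties are deterministic.
    ranked = sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


image_hits = {"sku-1": 0.9, "sku-2": 0.4}
text_hits = {"sku-2": 0.8, "sku-3": 0.6}
results = fuse_scores(image_hits, text_hits, image_weight=0.5, top_k=2)
```

In this example the item that appears in both result sets ranks first even though neither single-modality search placed it on top, which is the usual motivation for hybrid mode.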